General Relativity, Black Holes, And Cosmology

May 2, 2018 | Author: Daniel Ponce Martinez | Category: Special Relativity, Spacetime, General Relativity, Differential Topology, Space

General Relativity, Black Holes, and Cosmology Andrew J. S. Hamilton

Contents

Preface
Notation

PART ONE SPECIAL RELATIVITY
Concept Questions
What’s important in Special Relativity

1 Special Relativity
1.1 The postulates of special relativity
1.2 The paradox of the constancy of the speed of light
1.3 Paradoxes and simultaneity
1.4 Time dilation
1.5 Lorentz transformation
1.6 Paradoxes: Time dilation, Lorentz contraction, and the Twin paradox
1.7 The spacetime wheel
1.8 Scalar spacetime distance
1.9 4-vectors
1.10 Energy-momentum 4-vector
1.11 Photon energy-momentum
1.12 Abstract 4-vectors
1.13 What things look like at relativistic speeds
1.14 How to programme Lorentz transformations on a computer

PART TWO COORDINATE APPROACH TO GENERAL RELATIVITY
Concept Questions
What’s important?

2 Fundamentals of General Relativity
2.1 The postulates of General Relativity
2.2 Existence of locally inertial frames
2.3 Metric
2.4 Basis $g_\mu$ of tangent vectors
2.5 4-vectors and tensors
2.6 Covariant derivatives
2.7 Coordinate 4-velocity
2.8 Geodesic equation
2.9 Coordinate 4-momentum
2.10 Affine parameter
2.11 Affine distance
2.12 Riemann curvature tensor
2.13 Symmetries of the Riemann tensor
2.14 Ricci tensor, Ricci scalar
2.15 Einstein tensor
2.16 Bianchi identities
2.17 Covariant conservation of the Einstein tensor
2.18 Einstein equations
2.19 Summary of the path from metric to the energy-momentum tensor
2.20 Energy-momentum tensor of an ideal fluid
2.21 Newtonian limit

3 More on the coordinate approach
3.1 Weyl tensor
3.2 Evolution equations for the Weyl tensor
3.3 Geodesic deviation
3.4 Commutator of the covariant derivative revisited

4 Action principle
4.1 Principle of least action for point particles
4.2 Action for a test particle
4.3 Action for a charged test particle in an electromagnetic field
4.4 Generalized momentum
4.5 Hamiltonian
4.6 Derivatives of the action

PART THREE IDEAL BLACK HOLES
Concept Questions
What’s important?

5 Observational Evidence for Black Holes

6 Ideal Black Holes
6.1 Definition of a black hole
6.2 Ideal black hole
6.3 No-hair theorem

7 Schwarzschild Black Hole
7.1 Schwarzschild metric
7.2 Birkhoff’s theorem
7.3 Stationary, static
7.4 Spherically symmetric
7.5 Horizon
7.6 Proper time
7.7 Redshift
7.8 Proper distance
7.9 “Schwarzschild singularity”
7.10 Embedding diagram
7.11 Energy-momentum tensor
7.12 Weyl tensor
7.13 Gullstrand-Painlevé coordinates
7.14 Eddington-Finkelstein coordinates
7.15 Kruskal-Szekeres coordinates
7.16 Penrose diagrams
7.17 Schwarzschild white hole, wormhole
7.18 Collapse to a black hole
7.19 Killing vectors
7.20 Time translation symmetry
7.21 Spherical symmetry
7.22 Killing equation

8 Reissner-Nordström Black Hole
8.1 Reissner-Nordström metric
8.2 Energy-momentum tensor
8.3 Weyl tensor
8.4 Horizons
8.5 Gullstrand-Painlevé metric
8.6 Complete Reissner-Nordström geometry
8.7 Antiverse: Reissner-Nordström geometry with negative mass
8.8 Ingoing, outgoing
8.9 Mass inflation instability
8.10 Inevitability of mass inflation
8.11 The black hole particle accelerator
8.12 The X point
8.13 Extremal Reissner-Nordström geometry
8.14 Reissner-Nordström geometry with charge exceeding mass
8.15 Reissner-Nordström geometry with imaginary charge

9 Kerr-Newman Black Hole
9.1 Boyer-Lindquist metric
9.2 Oblate spheroidal coordinates
9.3 Time and rotation symmetries
9.4 Ring singularity
9.5 Horizons
9.6 Angular velocity of the horizon
9.7 Ergospheres
9.8 Antiverse
9.9 Closed timelike curves
9.10 Energy-momentum tensor
9.11 Weyl tensor
9.12 Electromagnetic field
9.13 Doran coordinates
9.14 Extremal Kerr-Newman geometry
9.15 Trajectories of test particles in the Kerr-Newman geometry
9.16 Penrose process
9.17 Constant latitude trajectories in the Kerr-Newman geometry
9.18 Principal null congruence
9.19 Circular orbits in the Kerr-Newman geometry

PART FOUR HOMOGENEOUS, ISOTROPIC COSMOLOGY
Concept Questions
What’s important?

10 Homogeneous, Isotropic Cosmology
10.1 Observational basis
10.2 Cosmological Principle
10.3 Friedmann-Robertson-Walker metric
10.4 Spatial part of the FRW metric: informal approach
10.5 Comoving coordinates
10.6 Spatial part of the FRW metric: more formal approach
10.7 FRW metric
10.8 Einstein equations for FRW metric
10.9 Newtonian “derivation” of Friedmann equations
10.10 Hubble parameter
10.11 Critical density
10.12 Omega
10.13 Redshifting
10.14 Types of mass-energy
10.15 Evolution of the cosmic scale factor
10.16 Conformal time
10.17 Looking back along the lightcone
10.18 Horizon

PART FIVE TETRAD APPROACH TO GENERAL RELATIVITY
Concept Questions
What’s important?

11 The tetrad formalism
11.1 Tetrad
11.2 Vierbein
11.3 The metric encodes the vierbein
11.4 Tetrad transformations
11.5 Tetrad Tensor
11.6 Raising and lowering indices
11.7 Gauge transformations
11.8 Directed derivatives
11.9 Tetrad covariant derivative
11.10 Relation between tetrad and coordinate connections
11.11 Torsion tensor
11.12 No-torsion condition
11.13 Antisymmetry of the connection coefficients
11.14 Connection coefficients in terms of the vierbein
11.15 Riemann curvature tensor
11.16 Ricci, Einstein, Bianchi
11.17 Electromagnetism

12 More on the tetrad formalism
12.1 Spinor tetrad formalism
12.2 Newman-Penrose tetrad formalism
12.3 Electromagnetic field tensor
12.4 Weyl tensor
12.5 Petrov classification of the Weyl tensor
12.6 Raychaudhuri equations and the Sachs optical scalars
12.7 Focussing theorem

13 The 3+1 (ADM) formalism
13.1 ADM tetrad
13.2 Traditional ADM approach
13.3 Spatial tetrad vectors and tensors
13.4 ADM connections, gravity, and extrinsic curvature
13.5 ADM Riemann, Ricci, and Einstein tensors
13.6 ADM action
13.7 ADM equations of motion
13.8 Constraints and energy-momentum conservation

14 The geometric algebra
14.1 Products of vectors
14.2 Geometric product
14.3 Reverse
14.4 The pseudoscalar and the Hodge dual
14.5 Reflection
14.6 Rotation
14.7 A rotor is a spin-1/2 object
14.8 A multivector rotation is an active rotation
14.9 2D rotations and complex numbers
14.10 Quaternions
14.11 3D rotations and quaternions
14.12 Pauli matrices
14.13 Pauli spinors
14.14 Pauli spinors as scaled 3D rotors, or quaternions
14.15 Spacetime algebra
14.16 Complex quaternions
14.17 Lorentz transformations and complex quaternions
14.18 Spatial Inversion (P) and Time Inversion (T)
14.19 Electromagnetic field bivector
14.20 How to implement Lorentz transformations on a computer
14.21 Dirac matrices
14.22 Dirac spinors
14.23 Dirac spinors as complex quaternions
14.24 Non-null Dirac spinor: particle and antiparticle
14.25 Null Dirac Spinor
14.26 Chiral decomposition of a Dirac spinor
14.27 Dirac equation
14.28 Antiparticles are negative mass particles moving backwards in time
14.29 Dirac equation with electromagnetism
14.30 CPT
14.31 Charge conjugation C
14.32 Parity reversal P
14.33 Time reversal T
14.34 Majorana spinor
14.35 Covariant derivatives revisited
14.36 General relativistic Dirac equation
14.37 3D Vectors as rank-2 spinors

PART SIX BLACK HOLE INTERIORS
Concept Questions
What’s important?

15 Black hole waterfalls
15.1 Tetrads move through coordinates
15.2 Gullstrand-Painlevé waterfall
15.3 Boyer-Lindquist tetrad
15.4 Doran waterfall

16 General spherically symmetric spacetime
16.1 Spherical spacetime
16.2 Spherical electromagnetic field
16.3 General relativistic stellar structure
16.4 Self-similar spherically symmetric spacetime

17 The interiors of spherical black holes
17.1 The mechanism of mass inflation
17.2 The far future?
17.3 Self-similar models of the interior structure of black holes
17.4 Instability at outer horizon?

PART SEVEN GENERAL RELATIVISTIC PERTURBATION THEORY
Concept Questions
What’s important?

18 Perturbations and gauge transformations
18.1 Notation for perturbations
18.2 Vierbein perturbation
18.3 Gauge transformations
18.4 Tetrad metric assumed constant
18.5 Perturbed coordinate metric
18.6 Tetrad gauge transformations
18.7 Coordinate gauge transformations
18.8 Coordinate gauge transformation of a coordinate scalar
18.9 Coordinate gauge transformation of a coordinate vector or tensor
18.10 Coordinate gauge transformation of a tetrad vector
18.11 Coordinate gauge transformation of the vierbein
18.12 Coordinate gauge transformation of the metric
18.13 Lie derivative

19 Scalar, vector, tensor decomposition
19.1 Decomposition of a vector in flat 3D space
19.2 Fourier version of the decomposition of a vector in flat 3D space
19.3 Decomposition of a tensor in flat 3D space

20 Flat space background
20.1 Classification of vierbein perturbations
20.2 Metric, tetrad connections, and Einstein and Weyl tensors
20.3 Spinor components of the Einstein tensor
20.4 Too many Einstein equations?
20.5 Action at a distance?
20.6 Comparison to electromagnetism
20.7 Harmonic gauge
20.8 What is the gravitational field?
20.9 Newtonian (Copernican) gauge
20.10 Synchronous gauge
20.11 Newtonian potential
20.12 Dragging of inertial frames
20.13 Quadrupole pressure
20.14 Gravitational waves
20.15 Energy-momentum carried by gravitational waves

PART EIGHT COSMOLOGICAL PERTURBATIONS
Concept Questions

21 An overview of cosmological perturbations

22 Cosmological perturbations in a flat Friedmann-Robertson-Walker background
22.1 Unperturbed line-element
22.2 Comoving Fourier modes
22.3 Classification of vierbein perturbations
22.4 Metric, tetrad connections, and Einstein tensor
22.5 ADM gauge choices
22.6 Conformal Newtonian gauge
22.7 Synchronous gauge

23 Cosmological perturbations: a simplest set of assumptions
23.1 Perturbed FRW line-element
23.2 Energy-momenta of ideal fluids
23.3 Diffusive damping
23.4 Equations for the simplest set of assumptions
23.5 Unperturbed background
23.6 Generic behaviour of non-baryonic cold dark matter
23.7 Generic behaviour of radiation
23.8 Regimes
23.9 Superhorizon scales
23.10 Radiation-dominated, adiabatic initial conditions
23.11 Radiation-dominated, isocurvature initial conditions
23.12 Subhorizon scales
23.13 Matter-dominated
23.14 Recombination
23.15 Post-recombination
23.16 Matter with dark energy
23.17 Matter with dark energy and curvature

24 Cosmological perturbations: a more careful treatment of photons and baryons
24.1 Lorentz-invariant spatial and momentum volume elements
24.2 Occupation numbers
24.3 Occupation numbers in thermodynamic equilibrium
24.4 Boltzmann equation
24.5 Non-baryonic cold dark matter
24.6 The left hand side of the Boltzmann equation for photons
24.7 Spherical harmonics of the photon distribution
24.8 Energy-momentum tensor for photons
24.9 Collisions
24.10 Electron-photon scattering
24.11 The photon collision term for electron-photon scattering
24.12 Boltzmann equation for photons
24.13 Diffusive (Silk) damping
24.14 Baryons
24.15 Viscous baryon drag damping
24.16 Photon-baryon wave equation
24.17 Damping of photon-baryon sound waves
24.18 Ionization and recombination
24.19 Neutrinos
24.20 Summary of equations
24.21 Legendre polynomials

25 Fluctuations in the Cosmic Microwave Background
25.1 Primordial power spectrum
25.2 Normalization of the power spectrum
25.3 CMB power spectrum
25.4 Matter power spectrum
25.5 Radiative transfer of CMB photons
25.6 Integrals over spherical Bessel functions
25.7 Large-scale CMB fluctuations
25.8 Monopole, dipole, and quadrupole contributions to $C_\ell$
25.9 Integrated Sachs-Wolfe (ISW) effect

Preface

Illusory preface

As of writing (May 2010), this book is incomplete. If you happen to discover this draft on the internet, you are welcome to it (and especially welcome to send me helpful advice and criticism). This book has been written during two semesters of teaching graduate general relativity at the University of Colorado, Boulder, during Spring 2008 and 2010. I hope to complete the book the next time I teach the course, which could possibly be Spring 2012.

Meanwhile, I am vividly aware of the book’s shortcomings. The book is incomplete in many parts, and needs pruning in others. If the early chapters read more like notes than a book, that is true; I was some way into writing before I realised that a book was taking shape. Especially, the book is missing many planned figures. Many of the anticipated figures can be found at three websites: “Special Relativity” (http://casa.colorado.edu/~ajsh/sr/sr.shtml), “Falling into a Black Hole” (http://casa.colorado.edu/~ajsh/schw.shtml), and “Inside Black Holes” (http://jila.colorado.edu/~ajsh/insidebh/index.html).

Although the book is incomplete, I have tried hard to keep mathematical errors from creeping in. If you find an error, especially in a minus sign or a factor, please let me know.

True preface

This book is driven by one overriding question: “What do students want?” A fundamental premise of this book is that, in the field of general relativity, the number one thing, by far, that students want to learn about is black holes. And the second thing they want to learn about is the cosmic microwave background.

This book is born in part out of frustration with the relentlessly left-brained character of so many texts on general relativity. I’m probably one of those left-brained characters myself. However, my experience in general relativistic visualization has convinced me that those interested in black holes from a mathematical perspective are vastly outnumbered by those fascinated by black holes for other reasons. Among those reasons are that black holes are, like dinosaurs, awesomely powerful, and supremely mysterious.

I worry that general relativity is (as I hear from students) often taught as if it were little more than tensor calculus. I fear that the abstract approach repels and culls our right-brained students until only the most left-leaning of our students remain to transmit left-brained relativity to the next generation.

Although I was fascinated by general relativity already as a graduate student, my active involvement with relativity was stimulated by students who insisted that I teach it. From the beginning it seemed obvious that the way to teach relativity was through visualization. Thus, through teaching, I began to do general relativistic visualizations of black holes. Initially the visualizations were simple animations, which I put together into a website “Falling into a Black Hole” in 1997 and 1998. The visualizations touched a chord with the outside world. In 2001/2 I had the privilege of spending a year’s sabbatical with the Denver Museum of Nature and Science, where I began developing the Black Hole Flight Simulator (BHFS). That sabbatical eventually led to a large-format immersive digital dome show “Black Holes: The Other Side of Infinity,” produced at the DMNS and directed by Tom Lucas. Premiering in 2006, that dome show has been distributed to some 40 digital domes worldwide. Since that time visualizations with the BHFS have appeared in several TV documentaries and in a number of exhibits. The experience of working with nonscience professionals has been, and continues to be, intensely enjoyable, and has sensitized me to the insidious cultural chasms that divide us in an increasingly specialized society.

These experiences have left a prominent dent in my thinking. For example, I think that the highlight of special relativity is the question of what you see and experience when you pass through a scene at near the speed of light.
Yet most texts scarcely mention the subject, if at all. Similarly, I think that the highlight of black holes is what (general relativity predicts) you see and experience when you fall into a black hole. Again, most texts hardly address the issue. Texts often do mention Penrose-Hawking singularity theorems. Yet few texts mention the all-important mass inflation instability discovered by E. Poisson & W. Israel (1990). The inflationary instability probably plays the central role in determining the interior structure of astronomically realistic black holes, and in particular in cutting off the wormhole and white hole connections to other universes that exist in the ideal Kerr geometry of a rotating black hole. Even E. Poisson (2004) “A Relativist’s Toolkit: The Mathematics of Black-Hole Mechanics” mentions the inflationary instability only in the problems at the end of the last chapter. An important goal of this book is to redress this hole in the teaching of general relativity.

The second focus of this book, after black holes, is the Cosmic Microwave Background (CMB). The CMB offers a profound window on the genesis of our Universe. Observations of fluctuations in the CMB are in astonishing agreement with the predictions of general relativistic perturbation theory coupled with some well-understood physics and some less well-understood but nevertheless successful ideas about inflation. For this book, the goal I set myself was to attempt the simplest possible treatment of CMB fluctuations that would yield a result that could be compared to observation. This is not an easy goal, since calculation of CMB fluctuations presents many technical challenges. I applaud recent texts such as M. P. Hobson, G. P. Efstathiou, & A. N. Lasenby (2006) “General Relativity: An Introduction for Physicists,” and T. Padmanabhan (2010) “Gravitation: Foundations and Frontiers,” which include chapters on the CMB power spectrum.
General texts like Hobson et al., Padmanabhan, and the present book by no means replace specialized books on cosmology and the CMB, such as S. Dodelson (2002) “Modern Cosmology,” R. Durrer (2008) “The Cosmic Microwave Background,” or D. H. Lyth & A. R. Liddle (2009) “The Primordial Density Perturbation.” However, for many students a course on general relativity may be the only opportunity they get to learn about the CMB, and I think that a modern course on general relativity should include a basic introduction to the CMB.

Notwithstanding its intended focus on applications rather than mathematics, this is not an easy book. It is a serious graduate-level text. R. M. Wald (2006, “Teaching the mathematics of general relativity”, Am. J. Phys. 74, 471–477, http://arxiv.org/abs/gr-qc/0511073) describes the challenges of teaching the necessary mathematics in a course on general relativity. This book will not make climbing the mathematical mountain of general relativity any easier. But the intention is that this book will help you get a clear view from the top, and not abandon you in fog. While the book does not shirk mathematics, I have tried hard to make the logic and derivations as clear and tight as possible.

So much for the overall goals of this book. What about the strategy to achieve those goals? Firstly, this book is intended as a book from which a student can learn, and a lecturer can teach. It is not intended as a reference book. In a learning/teaching book, one must choose carefully not only what to include, but also what not to include, because the latter distracts and dilutes.

The grand strategy is to go through general relativity in two passes. In the first pass, the aim is to run through the foundations of general relativity, and to get to ideal black holes as quickly as possible. In the second pass, the book essentially starts all over again, using a tetrad-based approach rather than a coordinate-based approach. The tetrad approach provides the basis for the subsequent treatment of non-ideal black holes, and of the cosmic microwave background.
The emphasis of (the second half of) this book on tetrads is unusual for a textbook, but consistent with its right-brained emphasis. If you want to see what’s happening in a spacetime, then you need to look at it with respect to the frame of an observer, which means working in a locally inertial (orthonormal) frame. The problem with coordinate frames is that they prescribe that the axes of the spacetime are the tangent vectors to the coordinates. These tangent vectors are skewed, not orthonormal. Looking at things in a coordinate frame is like looking at a scene with eyes crossed. Even with geometries as simple as the Friedmann-Robertson-Walker geometry of homogeneous, isotropic cosmology, it is necessary to play games to see plainly what its energy-momentum is (for FRW, raise one of the indices on the energy-momentum tensor; but that trick fails in more complicated spacetimes). Tetrads obviate the need to waste time attempting to conceptualize the distinction between vectors and covectors (one-forms). My own suspicion is that the locally flat structure of general relativity may be more fundamental than its globally geometric character, which could be an emergent phenomenon.

A virtue of the two-pass approach is that the student gets to revisit the fundamentals of general relativity from two similar but not identical perspectives. This reaffirmation of fundamentals is especially important given the fast pace and stripped-down coverage.

The course that I teach to senior physics undergraduates and beginning graduate students at the University of Colorado covers the following 8 topics, each topic taking about 2 weeks during the 16-week semester:

1. Pass 1.
   a. Chapter 1: Special relativity.
   b. Chapter 2: Coordinate approach to general relativity.
   c. Chapters 5–9: Ideal black holes, namely Schwarzschild, Reissner-Nordström, and Kerr-Newman.
   d. Chapter 10: Homogeneous, isotropic cosmology.
2. Pass 2.
   a. Chapter 11: Tetrad approach to general relativity.
   b. Chapters 15–17: Black hole interiors.
   c. Chapters 18–20: General relativistic perturbation theory.
   d. Chapters 21, 23, and 25: Cosmological perturbations.

The first of the eight topics is special relativity. Special relativity is an essential precursor to general relativity, since a fundamental postulate of general relativity is the Principle of Equivalence, which asserts that at any point there exist frames, called locally inertial, or free-fall, with respect to which special relativity operates locally. The strategy of Chapter 1 on Special Relativity is first to confront the paradox of the constancy of the speed of light, and from there to proceed rapidly to the highlight of special relativity, the question of what you see and experience when you pass through a scene at near the speed of light. I choose not to pause to discuss electromagnetism, actions, or other important topics in special relativity, since that would get in the way of the driving goal, to head to a black hole at the fastest possible pace.[1]

The second of the eight topics is what I call the coordinate approach to general relativity. This is a lightning introduction to the fundamental ingredients of general relativity, from the metric through to the Einstein tensor, using the traditional coordinate-based approach, where components of tensors are expressed relative to a basis of coordinate tangent vectors. To make the material more accessible, and to lay the groundwork for tetrads, the book builds on concepts of vectors familiar from high school, and avoids unnecessary mathematical distractions, such as emphasizing the distinction between vectors and 1-forms.
Typically, texts go through this material at a more leisurely pace, taking time to convey challenging conceptual issues. Here however I choose not to linger, for two reasons. The first is the obvious one: the goal is to get to black holes post-haste. The second reason is that, as mentioned earlier in this preface, looking at tensors in a coordinate basis is like looking at the world with eyes crossed. As my mother used to say, “If you do that and the wind changes, you’ll be stuck like that forever.”

[1] A strategy that might fail for students like myself. I learned special relativity from L. D. Landau & E. M. Lifshitz’s incomparable “The Classical Theory of Fields.” I recall vividly the extraordinary delight in discovering a text that, in contrast to those dreadful books that conveyed the idea that electromagnetism was something to do with resistors and capacitors, put relativity, Maxwell’s equations, and actions up front.

The third topic is ideal black holes, and here the pace slows. An ideal black hole is one that is stationary (time translation invariant), and empty outside its singularity, except for the contribution of a static electric field. In the 4 dimensions of the spacetime we live in, ideal black holes come in just a few varieties: the Schwarzschild geometry for a spherical, uncharged black hole, the Reissner-Nordström geometry for a spherical, charged black hole, and the Kerr-Newman geometry for a rotating, charged black hole.

The fourth topic is homogeneous, isotropic cosmology, the Friedmann-Robertson-Walker (FRW) geometry. The FRW geometry forms the essential background spacetime for the cosmological perturbation theory to be encountered later.

The book now enters the second pass. Tetrads (systems of locally inertial, or other, frames attached to each point of spacetime) are well known to, and widely used by, general relativists. The tetrad approach to general relativity is more complicated than the coordinate approach in that it requires an additional superstructure. However, the advantage of being able to see straight, because you are working in an orthonormal frame, outweighs the disadvantage of the additional overhead. While the coordinate approach is adequate for simple spacetimes (ideal black holes, and the FRW geometry), its defects are a barrier to understanding more complicated spacetimes. Most texts do not cover tetrads, or cover them as an aside. In this book, tetrads are developed systematically, in one self-contained chapter.

The sixth topic is black hole interiors. Cool, but needs more work.

The seventh topic is general relativistic perturbation theory, some understanding of which is prerequisite for dealing with cosmological perturbations and the CMB. The approach starts, in a thankfully short chapter, with one of the most difficult aspects of general relativistic perturbation theory, namely the problem of coordinate and tetrad gauge ambiguities. This might seem a peculiar starting point. A more typical starting point is to vary the metric, pick a gauge, and lo there are waves. However, I think that it is important to show how, at least in flat or FRW background spacetimes, the real physical perturbations emerge naturally from the formalism, without having to pick a gauge. In flat spacetime, the formalism picks out one particular gauge, the Newtonian gauge (though I think it should be called the Copernican gauge, because in the solar system it would pick out a Sun-centered almost-Cartesian frame), in which the perturbations retained are precisely the physical perturbations and no others. The Newtonian/Copernican gauge provides the natural arena for elucidating general relativistic phenomena such as the dragging of inertial frames, and gravitational waves.
The eighth and final topic is cosmological perturbation theory, emphasizing the calculation of fluctuations in the CMB. This was one of the most challenging parts of the book to write, because of the sheer volume of physics that goes into the calculation. I did my best to condense the core of the calculation into “a simplest set of assumptions,” Chapter 23. However, if you want to calculate a CMB power spectrum that you can actually compare to observations, then you’ll have to go beyond the “simple” chapter. Subsequent chapters will help you do that.

Beyond the eight topics described above, there are several chapters, all starred,[2] that contain relevant, but lower priority, material. The material contains some fun stuff, my favourite being “How to implement Lorentz transformations on a computer,” §14.20.

[2] The idea of starred chapters comes from Steven L. Weinberg’s classic 1972 text “Gravitation and Cosmology,” from which I learned general relativity. Curiously, I found his starred chapters often more interesting than the unstarred ones.

Notation

Except where actual units are needed, units are such that the speed of light is one, $c = 1$, and Newton’s gravitational constant is one, $G = 1$. The metric signature is −+++.

Greek (brown) letters α, β, ..., denote dummy 4D coordinate indices. Latin (black) letters a, b, ..., denote dummy 4D tetrad indices. Mid-alphabet Latin letters i, j, ... denote 3D indices, either coordinate (brown) or tetrad (black). To avoid distraction, colouring is applied only to coordinate indices, not to the coordinates themselves.

Specific (non-dummy) components of a vector are labelled by the corresponding coordinate (brown) or tetrad (black) direction, for example $A^\mu = \{A^t, A^x, A^y, A^z\}$ or $A^m = \{A^t, A^x, A^y, A^z\}$. Allowing the same label to denote either a coordinate or a tetrad index risks ambiguity, but it should be apparent from the context what is meant. Some texts distinguish coordinate and tetrad indices for example by a caret on the latter, but this produces notational overload.

Boldface denotes abstract vectors, in either 3D or 4D. In 4D, $\boldsymbol{A} = A^\mu \boldsymbol{g}_\mu = A^m \boldsymbol{\gamma}_m$, where $\boldsymbol{g}_\mu$ denote coordinate tangent axes, and $\boldsymbol{\gamma}_m$ denote tetrad axes.

Repeated paired dummy indices are summed over, the implicit summation convention. In special and general relativity, one index of a pair must be up (contravariant), while the other must be down (covariant). If the space being considered is Euclidean, then both indices may be down.

$\partial/\partial x^\mu$ denotes coordinate partial derivatives, which commute. $\partial_m$ denotes tetrad directed derivatives, which do not commute. $D_\mu$ and $D_m$ denote respectively coordinate-frame and tetrad-frame covariant derivatives.
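The summation convention and the rule that one index of a pair is up and the other down can be illustrated numerically. The following Python fragment is only a sketch under stated assumptions: it takes flat (Minkowski) spacetime with the −+++ signature and $c = 1$, and the names `ETA`, `lower_index`, and `dot` are hypothetical helpers introduced here, not notation from the book.

```python
# Diagonal of the flat metric eta_{mu nu} with signature -+++ (time in the zeroth slot).
ETA = [-1.0, 1.0, 1.0, 1.0]

def lower_index(a_up):
    """Return A_mu = eta_{mu nu} A^nu for a diagonal metric."""
    return [g * a for g, a in zip(ETA, a_up)]

def dot(a_up, b_up):
    """Scalar product A^mu B_mu: one index up, one down, summed over mu."""
    b_down = lower_index(b_up)
    return sum(a * b for a, b in zip(a_up, b_down))

# A photon moving along x has a null 4-momentum; an observer at rest has u.u = -1.
photon = [1.0, 1.0, 0.0, 0.0]
at_rest = [1.0, 0.0, 0.0, 0.0]

print("photon . photon =", dot(photon, photon))
print("at_rest . at_rest =", dot(at_rest, at_rest))
```

In a curved spacetime the diagonal list would have to be replaced by the full position-dependent metric, so this fragment should be read as an illustration of the index bookkeeping only.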

Choice of metric signature There is a tendency, by no means unanimous, for general relativists to prefer the −+++ metric signature, while particle physicists prefer +−−−. For someone like me who does general relativistic visualization, there is no contest: the choice has to be −+++, so that signs remain consistent between 3D spatial vectors and 4D spacetime vectors. For example,

Notation

7

the 3D industry knows well that quaternions provide the most efficient and powerful way to implement spatial rotations. As shown in Chapter 14, complex quaternions provide the best way to implement Lorentz transformations, with the subgroup of real quaternions continuing to provide spatial rotations. Compatibility requires −+++. Actually, OpenGL and other graphics languages put spatial coordinates in the first three indices, leaving time to occupy the fourth index; but in these notes I stick to the physics convention of putting time in the zeroth index. In practical calculations it is convenient to be able to switch transparently between boldface and index notation in both 3D and 4D contexts. This is where the +−−− signature poses greater potential for misinterpretation in 3D. For example, with this signature, what is the sign of the 3D scalar product

$$\boldsymbol{a} \cdot \boldsymbol{b} \ ? \tag{0.1}$$

Is it $\boldsymbol{a} \cdot \boldsymbol{b} = \sum_{i=1}^{3} a_i b^i$ or $\boldsymbol{a} \cdot \boldsymbol{b} = \sum_{i=1}^{3} a^i b^i$? To be consistent with common 3D usage, it must be the latter. With the +−−− signature, it must be that $\boldsymbol{a} \cdot \boldsymbol{b} = -a_i b^i$, where the repeated indices signify implicit summation over spatial indices. So you have to remember to introduce a minus sign in switching between boldface and index notation. As another example, what is the sign of the 3D vector product

$$\boldsymbol{a} \times \boldsymbol{b} \ ? \tag{0.2}$$

Is it $\boldsymbol{a} \times \boldsymbol{b} = \sum_{jk=1}^{3} \varepsilon_{ijk} a^j b^k$ or $\boldsymbol{a} \times \boldsymbol{b} = \sum_{jk=1}^{3} \varepsilon^{i}{}_{jk} a^j b^k$ or $\boldsymbol{a} \times \boldsymbol{b} = \sum_{jk=1}^{3} \varepsilon^{ijk} a_j b_k$? Well, if you want to switch transparently between boldface and index notation, and you decide that you want boldface consistently to signify a vector with a raised index, then maybe you’d choose the middle option. To be consistent with standard 3D convention for the sign of the vector product, maybe you’d choose $\varepsilon^{i}{}_{jk}$ to have positive sign for ijk an even permutation of xyz. Finally, what is the sign of the 3D spatial gradient operator

$$\boldsymbol{\nabla} \equiv \frac{\partial}{\partial \boldsymbol{x}} \ ? \tag{0.3}$$

Is it $\boldsymbol{\nabla} = \partial/\partial x^i$ or $\boldsymbol{\nabla} = \partial/\partial x_i$? Convention dictates the former, in which case it must be that some boldface 3D vectors must signify a vector with a raised index, and others a vector with a lowered index. Oh dear.

PART ONE SPECIAL RELATIVITY

Concept Questions

1. What does c = universal constant mean? What is speed? What is distance? What is time?
2. c + c = c. How can that be possible?
3. The first postulate of special relativity asserts that spacetime forms a 4-dimensional continuum. The fourth postulate of special relativity asserts that spacetime has no absolute existence. Isn’t that a contradiction?
4. The principle of special relativity says that there is no absolute spacetime, no absolute frame of reference with respect to which position and velocity are defined. Yet does not the cosmic microwave background define such a frame of reference?
5. How can two people moving relative to each other at near c both think each other’s clock runs slow?
6. How can two people moving relative to each other at near c both think the other is Lorentz-contracted?
7. All paradoxes in special relativity have the same solution. In one word, what is that solution?
8. All conceptual paradoxes in special relativity can be understood by drawing what kind of diagram?
9. Your twin takes a trip to α Cen at near c, then returns to Earth at near c. Meeting your twin, you see that the twin has aged less than you. But from your twin’s perspective, it was you that receded at near c, then returned at near c, so your twin thinks you aged less. Is it true?
10. Blobs in the jet of the galaxy M87 have been tracked by the Hubble Space Telescope to be moving at about 6c. Does this violate special relativity?
11. If you watch an object move at near c, does it actually appear Lorentz-contracted? Explain.
12. You speed towards the center of our Galaxy, the Milky Way, at near c. Does the center appear to you closer or farther away?
13. You go on a trip to the center of the Milky Way, 30,000 lightyears distant, at near c. How long does the trip take you?
14. You surf a light ray from a distant quasar to Earth. How much time does the trip take, from your perspective?
15. If light is a wave, what is waving?
16. As you surf the light ray, how fast does it appear to vibrate?
17. How does the phase of a light ray vary along the light ray? Draw surfaces of constant phase on a spacetime diagram.
18. You see a distant galaxy at a redshift of z = 1. If you could see a clock on the galaxy, how fast would the clock appear to tick? Could this be tested observationally?
19. You take a trip to α Cen at near c, then instantaneously accelerate to return at near c. If you are looking through a telescope at a clock on the Earth while you instantaneously accelerate, what do you see happen to the clock?
20. In what sense is time an imaginary spatial dimension?
21. In what sense is a Lorentz boost a rotation by an imaginary angle?
22. You know what it means for an object to be rotating at constant angular velocity. What does it mean for an object to be boosting at a constant rate?
23. A wheel is spinning so that its rim is moving at near c. The rim is Lorentz-contracted, but the spokes are not. How can that be?
24. You watch a wheel rotate at near the speed of light. The spokes appear bent. How can that be?
25. Does a sunbeam appear straight or bent when you pass by it at near the speed of light?
26. Energy and momentum are unified in special relativity. Explain.
27. In what sense is mass equivalent to energy in special relativity? In what sense is mass different from energy?
28. Why is the Minkowski metric unchanged by a Lorentz transformation?
29. What is the best way to program Lorentz transformations on a computer?

What’s important in Special Relativity

See http://casa.colorado.edu/~ajsh/sr/

1. Postulates of special relativity.
2. Understanding conceptually the unification of space and time implied by special relativity.
   a. Spacetime diagrams.
   b. Simultaneity.
   c. Understanding the paradoxes of relativity: time dilation, Lorentz contraction, the twin paradox.
3. The mathematics of spacetime transformations.
   a. Lorentz transformations.
   b. Invariant spacetime distance.
   c. Minkowski metric.
   d. 4-vectors.
   e. Energy-momentum 4-vector. $E = mc^2$.
   f. The energy-momentum 4-vector of massless particles, such as photons.
4. What things look like at relativistic speeds.

1 Special Relativity

1.1 The postulates of special relativity

The theory of special relativity can be derived formally from a small number of postulates:

1. Space and time form a 4-dimensional continuum;
2. The existence of globally inertial frames;
3. The speed of light is constant;
4. The principle of special relativity.

The first two postulates are assertions about the structure of spacetime, while the last two postulates form the heart of special relativity. Most books mention just the last two postulates, but I think it is important to know that special (and general) relativity simply postulate the 4-dimensional character of spacetime, and that special relativity postulates moreover that spacetime is flat.

1. Space and time form a 4-dimensional continuum. The correct mathematical word for continuum is manifold. A 4-dimensional manifold is defined mathematically to be a topological space that is locally homeomorphic to Euclidean 4-space $\mathbb{R}^4$. The postulate that spacetime forms a 4-dimensional continuum is a generalization of the classical Galilean concept that space and time form separate 3 and 1 dimensional continua. The postulate of a 4-dimensional spacetime continuum is retained in general relativity. Physicists widely believe that this postulate must ultimately break down, that space and time are quantized over extremely small intervals of space and time, the Planck length $\sqrt{G\hbar/c^3} \approx 10^{-35}$ m, and the Planck time $\sqrt{G\hbar/c^5} \approx 10^{-43}$ s, where G is Newton’s gravitational constant, $\hbar \equiv h/(2\pi)$ is Planck’s constant divided by 2π, and c is the speed of light.

2. The existence of globally inertial frames. Statement: “There exist global spacetime frames with respect to which unaccelerated objects move in straight lines at constant velocity.” A spacetime frame is a system of coordinates for labelling space and time. Four coordinates are needed, because spacetime is 4-dimensional. A frame in which unaccelerated objects move in straight lines at constant
velocity is called an inertial frame. One can easily think of non-inertial frames: a rotating frame, an accelerating frame, or simply a frame with some bizarre Dahlian labelling of coordinates. A globally inertial frame is an inertial frame that covers all of space and time. The postulate that globally inertial frames exist is carried over from classical mechanics (Newton’s first law of motion). Notice the subtle shift from the Newtonian perspective. The postulate is not that particles move in straight lines, but rather that there exist spacetime frames with respect to which particles move in straight lines. Implicit in the assumption of the existence of globally inertial frames is the assumption that the geometry of spacetime is flat, the geometry of Euclid, where parallel lines remain parallel to infinity. In general relativity, this postulate is replaced by the weaker postulate that local (not global) inertial frames exist. A locally inertial frame is one which is inertial in a “small neighbourhood” of a spacetime point. In general relativity, spacetime can be curved. 3. The speed of light is constant. Statement: “The speed of light c is a universal constant, the same in any inertial frame.” This postulate is the nub of special relativity. The immediate challenge of this chapter, §1.2, is to confront its paradoxical implications, and to resolve them. Measuring speed requires being able to measure intervals of both space and time: speed is distance travelled divided by time elapsed. Inertial frames constitute a special class of spacetime coordinate systems; it is with respect to distance and time intervals in these special frames that the speed of light is asserted to be constant. In general relativity, arbitrarily weird coordinate systems are allowed, and light need move neither in straight lines nor at constant velocity with respect to bizarre coordinates (why should it, if the labelling of space and time is totally arbitrary?). 
However, general relativity asserts the existence of locally inertial frames, and the speed of light is a universal constant in those frames. In 1983, the General Conference on Weights and Measures officially defined the speed of light to be

$$c \equiv 299{,}792{,}458 \ \mathrm{m\,s^{-1}} , \tag{1.1}$$

and the meter, instead of being a primary measure, became a secondary quantity, defined in terms of the second and the speed of light.

4. The principle of special relativity. Statement: “The laws of physics are the same in any inertial frame, regardless of position or velocity.” Physically, this means that there is no absolute spacetime, no absolute frame of reference with respect to which position and velocity are defined. Only relative positions and velocities between objects are meaningful. It is to be noted that the principle of special relativity does not imply the constancy of the speed of light, although the postulates are consistent with each other. Moreover the constancy of the speed of light does not imply the principle of special relativity, although for Einstein the former appears to have been the inspiration for the latter. An example of the application of the principle of special relativity is the construction of the energy-momentum 4-vector of a particle, which should have the same form in any inertial frame (§1.10).

1.2 The paradox of the constancy of the speed of light

The postulate that the speed of light is the same in any inertial frame leads immediately to a paradox. Resolution of this paradox compels a revolution in which space and time are united from separate 3 and 1-dimensional continua into a single 4-dimensional continuum.

Here, Figure ??, is Vermilion. She emits a flash of light. Vermilion thinks that the light moves outward at the same speed in all directions. So Vermilion thinks that she is at the centre of the expanding sphere of light. But here also, Figure ??, is Cerulean, moving away from Vermilion, at about 1/2 the speed of light. Vermilion thinks that she is at the centre of the expanding sphere of light, as before. But, says special relativity, Cerulean also thinks that the light moves outward at the same speed in all directions from him. So Cerulean should be at the centre of the expanding light sphere too. But he’s not, is he. Paradox!

Concept question 1.1 Would the light have expanded differently if Cerulean had emitted the light?

1.2.1 Challenge

Can you figure out Einstein’s solution to the paradox? Somehow you have to arrange that both Vermilion and Cerulean regard themselves as being in the centre of the expanding sphere of light.

1.2.2 Spacetime diagram

A spacetime diagram suggests a way of thinking which leads to the solution of the paradox of the constancy of the speed of light. Indeed, spacetime diagrams provide the way to resolve all conceptual paradoxes in special relativity, so it is thoroughly worthwhile to understand them.

A spacetime diagram, Figure ??, is a diagram in which the vertical axis represents time, while the horizontal axis represents space. Really there are three dimensions of space, which can be thought of as filling additional horizontal dimensions. But for simplicity a spacetime diagram usually shows just one spatial dimension. In a spacetime diagram, the units of space and time are chosen so that light goes one unit of distance in one unit of time, i.e. the units are such that the speed of light is one, c = 1. Thus light always moves upward at 45° from vertical in a spacetime diagram.

Each point in 4-dimensional spacetime is called an event. Light signals converging to or expanding from an event follow a 3-dimensional hypersurface called the lightcone. Light converging on to an event is on the past lightcone, while light emerging from an event is on the future lightcone.

Here is a spacetime diagram of Vermilion emitting a flash of light, and Cerulean moving relative to Vermilion at about 1/2 the speed of light. This is a spacetime diagram version of the situation illustrated in Figure ??. The lines along which Vermilion and Cerulean move through spacetime are called their worldlines. Consider again the challenge problem. The problem is to arrange that both Vermilion and Cerulean are at the centre of the lightcone, from their own points of view.

Here’s a clue. Cerulean’s concept of space and time may not be the same as Vermilion’s.

1.2.3 Centre of the lightcone Einstein’s solution to the paradox is that Cerulean’s spacetime is skewed compared to Vermilion’s, as illustrated by Figure ??. The thing to notice in the diagram is that Cerulean is in the centre of the lightcone, according to the way Cerulean perceives space and time. Vermilion remains at the centre of the lightcone according to the way Vermilion perceives space and time. In the diagram Vermilion and her space are drawn at one “tick” of her clock past the point of emission, and likewise Cerulean and his space are drawn at one “tick” of his identical clock past the point of emission. Of course, from Cerulean’s point of view his spacetime is quite normal, and it’s Vermilion’s spacetime that is skewed. In special relativity, the transformation between the spacetime frames of two inertial observers is called a Lorentz transformation. In general, a Lorentz transformation consists of a spatial rotation about some spatial axis, combined with a Lorentz boost by some velocity in some direction. Only space along the direction of motion gets skewed with time. Distances perpendicular to the direction of motion remain unchanged. Why must this be so? Consider two hoops which have the same size when at rest relative to each other. Now set the hoops moving towards each other. Which hoop passes inside the other? Neither! For suppose Vermilion thinks Cerulean’s hoop passed inside hers; by symmetry, Cerulean must think Vermilion’s hoop passed inside his; but both cannot be true; the only possibility is that the hoops remain the same size in directions perpendicular to the direction of motion. Cottoned on? Then you have understood the crux of special relativity, and you can now go away and figure out all the mathematics of Lorentz transformations. Just like Einstein. 
The mathematical problem is: what is the relation between the spacetime coordinates {t, x, y, z} and {t′, x′, y′, z′} of a spacetime interval, a 4-vector, in Vermilion’s versus Cerulean’s frames, if Cerulean is moving relative to Vermilion at velocity v in, say, the x direction? The solution follows from requiring

1. that both observers consider themselves to be at the centre of the lightcone, and
2. that distances perpendicular to the direction of motion remain unchanged,

as illustrated by Figure ??. (An alternative version of the second condition is that a Lorentz transformation at velocity v followed by a Lorentz transformation at velocity −v should yield the unit transformation.) Note that the postulate of the existence of globally inertial frames implies that Lorentz transformations are linear, that straight lines (4-vectors) in one inertial spacetime frame transform into straight lines in other inertial frames. You will solve this problem in the next section but two, §1.5. As a prelude, the next two sections, §§1.3 and 1.4, discuss simultaneity and time dilation.

1.3 Paradoxes and simultaneity

Most (all?) of the apparent paradoxes of special relativity arise because observers moving at different velocities relative to each other have different notions of simultaneity.

1.3.1 Operational definition of simultaneity

How can simultaneity, the notion of events occurring at the same time at different places, be defined operationally? One way is illustrated in Figure ??. Vermilion surrounds herself with a set of mirrors, equidistant from Vermilion. She sends out a flash of light, which reflects off the mirrors back to Vermilion. How does Vermilion know that the mirrors are all the same distance from her? Because the reflected flash returns from the mirrors to Vermilion all at the same instant. Vermilion asserts that the light flash must have hit all the mirrors simultaneously. Vermilion also asserts that the instant when the light hit the mirrors must have been the instant, as registered by her wristwatch, precisely half way between the moment she emitted the flash and the moment she received it back again. If it takes, say, 2 seconds between flash and receipt, then Vermilion concludes that the mirrors are 1 lightsecond away from her.

Figure ?? shows a spacetime diagram of Vermilion’s mirror experiment above. According to Vermilion, the light hits the mirrors everywhere at the same instant, and the spatial hyperplane passing through these events is a hypersurface of simultaneity. More generally, from Vermilion’s perspective, each horizontal hyperplane in the spacetime diagram is a hypersurface of simultaneity.

Cerulean defines surfaces of simultaneity using the same operational setup: he encompasses himself with mirrors, arranging them so that a flash of light returns from them to him all at the same instant. But whereas Cerulean concludes that his mirrors are all equidistant from him and that the light bounces off them all at the same instant, Vermilion thinks otherwise. From Vermilion’s point of view, the light bounces off Cerulean’s mirrors at different times and moreover at different distances from Cerulean. Only so can the speed of light be constant, as Vermilion sees it, and yet the light return to Cerulean all at the same instant.
Of course from Cerulean’s point of view all is fine: he thinks his mirrors are equidistant from him, and that the light bounces off them all at the same instant. The inevitable conclusion is that Cerulean must measure space and time along axes that are skewed relative to Vermilion’s. Events that happen at the same time according to Cerulean happen at different times according to Vermilion; and vice versa. Cerulean’s hypersurfaces of simultaneity are not the same as Vermilion’s. From Cerulean’s point of view, Cerulean remains always at the centre of the lightcone. Thus for Cerulean, as for Vermilion, the speed of light is constant, the same in all directions.

1.4 Time dilation

Vermilion and Cerulean construct identical clocks, consisting of a light beam which bounces off a mirror. Tick, the light beam hits the mirror, tock, the beam returns to its owner. As long as Vermilion and Cerulean remain at rest relative to each other, both agree that each other’s clock tick-tocks at the same rate as their own. But now suppose Cerulean goes off at velocity v relative to Vermilion, in a direction perpendicular to the direction of the mirror. As far as Cerulean is concerned, his clock tick-tocks at the same rate as before, a tick
at the mirror, a tock on return. But from Vermilion’s point of view, although the distance between Cerulean and his mirror at any instant remains the same as before, the light has further to go. And since the speed of light is constant, Vermilion thinks it takes longer for Cerulean’s clock to tick-tock than her own. Thus Vermilion thinks Cerulean’s clock runs slow relative to her own.

1.4.1 Lorentz gamma factor

How much slower does Cerulean’s clock run, from Vermilion’s point of view? In special relativity the factor is called the Lorentz gamma factor γ, introduced by the Dutch physicist Hendrik A. Lorentz in 1904, one year before Einstein proposed his theory of special relativity. Let us see how the Lorentz gamma factor is related to Cerulean’s velocity v. In units where the speed of light is one, c = 1, Vermilion’s mirror is one tick away from her, and from her point of view the vertical distance between Cerulean and his mirror is the same, one tick. But Vermilion thinks that the distance travelled by the light beam between Cerulean and his mirror is γ ticks. Cerulean is moving at speed v, so Vermilion thinks he moves a distance of γv ticks during the γ ticks of time taken by the light to travel from Cerulean to his mirror. Thus, from Vermilion’s point of view, the vertical line from Cerulean to his mirror, Cerulean’s light beam, and Cerulean’s path form a triangle with sides 1, γ, and γv, as illustrated. Pythagoras’ theorem implies that

$$1^2 + (\gamma v)^2 = \gamma^2 . \tag{1.2}$$

From this it follows that the Lorentz gamma factor γ is related to Cerulean’s velocity v by

$$\gamma = \frac{1}{\sqrt{1 - v^2}} , \tag{1.3}$$

which is Lorentz’s famous formula.
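The light-clock triangle can be checked numerically. The following is a minimal sketch in Python, in units where c = 1; the function names `lorentz_gamma` and `light_clock_triangle` are illustrative choices, not notation from the text.

```python
import math


def lorentz_gamma(v):
    """Lorentz gamma factor of equation (1.3), for speed v in units where c = 1."""
    if not -1.0 < v < 1.0:
        raise ValueError("speed must satisfy |v| < 1 in units where c = 1")
    return 1.0 / math.sqrt(1.0 - v * v)


def light_clock_triangle(v):
    """Sides of the light-clock triangle as Vermilion sees it, in ticks:
    (distance from Cerulean to his mirror, distance Cerulean moves,
    distance travelled by the light beam)."""
    gamma = lorentz_gamma(v)
    return 1.0, gamma * v, gamma


if __name__ == "__main__":
    vertical, horizontal, hypotenuse = light_clock_triangle(0.6)
    # Pythagoras, equation (1.2): the two printed numbers should agree.
    print(vertical**2 + horizontal**2, hypotenuse**2)
```

The guard on |v| < 1 reflects the fact that γ is real and finite only for speeds below that of light.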

1.4.2 Paradox Vermilion thinks Cerulean’s clock runs slow, by the Lorentz factor γ. But of course from Cerulean’s perspective it is Vermilion who is moving, and Vermilion whose clock runs slow. How can both think the other’s clock runs slow? Paradox! The resolution of the paradox, as usual in special relativity, involves simultaneity, and as usual it helps to draw a spacetime diagram, such as the one in Figure ??. While Vermilion thinks events happen simultaneously along horizontal planes in this diagram, Cerulean thinks events occur simultaneously along skewed planes. Thus Vermilion thinks her clock ticks when Cerulean is at the point before Cerulean’s clock ticks. Conversely, Cerulean thinks his clock ticks when Vermilion is at the point before Vermilion’s clock ticks. Where do the two light beams in Vermilion’s and Cerulean’s clocks go in this spacetime diagram? Figure ?? shows a 3D spacetime diagram.

Concept question 1.2 Figure ?? shows a picture of a 3D cube. Is one edge shorter than the other? Projected on to the page, it appears so, but in reality all the edges have equal length. In what ways is this situation similar or dissimilar to time dilation in 4D relativity?

1.5 Lorentz transformation A Lorentz transformation is a rotation of space and time. Lorentz transformations form a 6-dimensional group, with 3 dimensions from spatial rotations, and 3 dimensions from Lorentz boosts. If you wish to understand special relativity mathematically, then it is essential for you to go through the exercise of deriving the form of Lorentz transformations for yourself. Indeed, this problem is the challenge problem posed in §1.2, recast as a mathematical exercise. For simplicity, it is enough to consider the case of a Lorentz boost by velocity v along the x-axis. You can derive the form of a Lorentz transformation either pictorially (geometrically), or algebraically. Ideally you should do both. Exercise 1.3 Pictorial derivation of the Lorentz transformation. Construct, with ruler and compass, a spacetime diagram that looks like the one in Figure ??. You should recognize that the square represents the paths of lightrays that Vermilion uses to define a hypersurface of simultaneity, while the rectangle represents the same thing for Cerulean. Notice that Cerulean’s worldline and line of simultaneity are diagonals along his light rectangle, so the angles between those lines and the lightcone are equal. Notice also that the areas of the square and the rectangle are the same, which expresses the fact that the area is multiplied by the determinant of the Lorentz transformation matrix, which must be one (why?). Use your geometric construction to derive the mathematical form of the Lorentz transformation. Exercise 1.4 3D model of the Lorentz transformation. Make a 3D spacetime diagram of the Lorentz transformation, with not only an x-dimension, as in the previous problem, but also a y-dimension. Resist the temptation to use a 3D computer modelling program. Believe me, you will learn much more from hands-on model-making. 
Make the lightcone from flexible paperboard, the spatial hypersurface of simultaneity from stiff paperboard, and the worldline from wooden dowel.

Exercise 1.5 Mathematical derivation of the Lorentz transformation. Relative to person A (Vermilion, unprimed frame), person B (Cerulean, primed frame) moves at velocity v along the x-axis. Derive the form of the Lorentz transformation between the coordinates (t, x, y, z) of a 4-vector in A’s frame and the corresponding coordinates (t′, x′, y′, z′) in B’s frame from the assumptions:

1. that the transformation is linear;
2. that the spatial coordinates in the directions orthogonal to the direction of motion are unchanged;
3. that the speed of light c is the same for both A and B, so that x = t in A’s frame transforms to x′ = t′ in B’s frame, and likewise x = −t in A’s frame transforms to x′ = −t′ in B’s frame;
4. the definition of speed; if B is moving at speed v relative to A, then x = vt in A’s frame transforms to x′ = 0 in B’s frame;
5. spatial isotropy; specifically, show that if A thinks B is moving at velocity v, then B must think that A is moving at velocity −v, and symmetry (spatial isotropy) between these two situations then fixes the Lorentz γ factor.

Your logic should be precise, and explained in clear, concise English. You should find that the Lorentz transformation for a Lorentz boost by velocity v along the x-axis is

$$\begin{aligned} t' &= \gamma t - \gamma v x \\ x' &= -\gamma v t + \gamma x \\ y' &= y \\ z' &= z \end{aligned} \ , \qquad \begin{aligned} t &= \gamma t' + \gamma v x' \\ x &= \gamma v t' + \gamma x' \\ y &= y' \\ z &= z' \end{aligned} \ . \tag{1.4}$$

The transformation can be written elegantly in matrix notation:

$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} , \tag{1.5}$$

with inverse

$$\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \gamma & \gamma v & 0 & 0 \\ \gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} . \tag{1.6}$$

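A Lorentz transformation at velocity v followed by a Lorentz transformation at velocity v in the opposite direction, i.e. at velocity −v, yields the unit transformation, as it should:

$$\begin{pmatrix} \gamma & \gamma v & 0 & 0 \\ \gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{1.7}$$

The determinant of the Lorentz transformation is one, as it should be:

$$\begin{vmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} = \gamma^2 (1 - v^2) = 1 . \tag{1.8}$$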

Indeed, requiring that the determinant be one provides another derivation of the formula (1.3) for the Lorentz gamma factor. Concept question 1.6

Why must the determinant of a Lorentz transformation be one?
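The matrix form of the boost, and the properties (1.7) and (1.8), can be verified numerically. The sketch below uses plain Python lists rather than a matrix library; the helper names `boost_x`, `matmul`, `apply` and `det_tx_block` are assumptions made for illustration.

```python
import math


def boost_x(v):
    """4x4 matrix of equation (1.5): Lorentz boost by velocity v along x (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [
        [gamma, -gamma * v, 0.0, 0.0],
        [-gamma * v, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(m, vec):
    """Apply a 4x4 matrix to a 4-vector (t, x, y, z)."""
    return [sum(m[i][k] * vec[k] for k in range(4)) for i in range(4)]


def det_tx_block(m):
    """Determinant of the t-x block. For a boost along x this equals the full
    determinant, because the y-z block is the identity."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


if __name__ == "__main__":
    v = 0.6
    # Inverse boost times forward boost, as in equation (1.7).
    for row in matmul(boost_x(-v), boost_x(v)):
        print(row)
    # Determinant, as in equation (1.8).
    print(det_tx_block(boost_x(v)))
    # A light ray x = t should transform to x' = t'.
    print(apply(boost_x(v), [1.0, 1.0, 0.0, 0.0]))
```

The inverse is obtained here simply by calling `boost_x(-v)`, which mirrors the statement that a boost at velocity −v undoes a boost at velocity v.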

1.6 Paradoxes: Time dilation, Lorentz contraction, and the Twin paradox There are several classic paradoxes in special relativity. Two of them have already been met above: the paradox of the constancy of the speed of light in §1.2, and the paradox of time dilation in §1.4. This section collects three famous paradoxes: time dilation (reiterating §1.4), Lorentz contraction, and the Twin paradox. If you wish to understand special relativity conceptually, then you should work through all these paradoxes yourself. As remarked in §1.3, most (all?) paradoxes in special relativity arise because different observers have different notions of simultaneity, and most (all?) paradoxes can be solved using spacetime diagrams. The Twin paradox is particularly helpful because it illustrates several different facets of special relativity, not only time dilation, but also how light travel time modifies what an observer actually sees.

1.6.1 Time dilation

If a timelike interval {t, r} corresponds to motion at velocity v, then r = vt. The proper time along the interval is

$$\tau = \sqrt{t^2 - r^2} = t \sqrt{1 - v^2} = \frac{t}{\gamma} . \tag{1.9}$$

This is Lorentz time dilation: the proper time interval τ experienced by a moving person is a factor γ less than the time interval t according to an onlooker.

Exercise 1.7 On a spacetime diagram, show how two observers moving relative to each other can both consider the other’s clock to run slow compared to their own.

1.6.2 Fitzgerald-Lorentz contraction

Consider a rocket of proper length l, so that in the rocket’s own rest frame (primed) the back and front ends of the rocket move through time t′ with coordinates

$$\{t', x'\} = \{t', 0\} \quad \text{and} \quad \{t', l\} . \tag{1.10}$$

From the perspective of an observer who sees the rocket move at velocity v in the x-direction, the worldlines of the back and front ends of the rocket are at

$$\{t, x\} = \{\gamma t', \gamma v t'\} \quad \text{and} \quad \{\gamma t' + \gamma v l, \gamma v t' + \gamma l\} . \tag{1.11}$$

However, the observer measures the length of the rocket simultaneously in their own frame, not the rocket frame. Solving for γt′ = t at the back and γt′ + γvl = t at the front gives

$$\{t, x\} = \{t, vt\} \quad \text{and} \quad \left\{ t, vt + \frac{l}{\gamma} \right\} \tag{1.12}$$

which says that the observer measures the front end of the rocket to be a distance l/γ ahead of the back end. This is Lorentz contraction: an object of proper length l is measured by a moving person to be shorter by a factor γ.

Exercise 1.8 On a spacetime diagram, show how two observers moving relative to each other can both consider the other to be contracted along the direction of motion.
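The time dilation formula (1.9) and the contraction result (1.12) can be reproduced with a few lines of arithmetic. This is a sketch in Python, in units where c = 1; the function names and the sample values (v = 0.6, l = 1) are illustrative assumptions.

```python
import math


def lorentz_gamma(v):
    """Lorentz gamma factor for speed v, in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)


def proper_time(t, r):
    """Proper time along a timelike interval {t, r}, equation (1.9)."""
    return math.sqrt(t * t - r * r)


def rocket_ends(t, v, length):
    """Positions of the back and front of a rocket of proper length `length`,
    both evaluated at the same observer time t, following (1.11) and (1.12)."""
    gamma = lorentz_gamma(v)
    back = v * t
    # Front worldline (1.11): {gamma t' + gamma v l, gamma v t' + gamma l}.
    # Solve gamma t' + gamma v l = t for the rocket time t' at the front.
    t_prime = t / gamma - v * length
    front = gamma * v * t_prime + gamma * length
    return back, front


if __name__ == "__main__":
    v, length = 0.6, 1.0
    back, front = rocket_ends(2.0, v, length)
    # The two printed lengths should agree: measured length versus l / gamma.
    print(front - back, length / lorentz_gamma(v))
    # Moving at v = 0.6 for coordinate time 5 covers distance 3.
    print(proper_time(5.0, 3.0), 5.0 / lorentz_gamma(v))
```

Solving for t′ explicitly, rather than substituting l/γ directly, keeps the code parallel to the step in the text where simultaneity in the observer’s frame is imposed.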

1.6.3 Twin paradox

See Exercise 1.11 at the end of the chapter.

1.7 The spacetime wheel

1.7.1 3D wheel

Figure ?? shows an ordinary 3D wheel. As the wheel rotates, a point on the wheel describes an invariant circle. The coordinates {x, y} of a point on the wheel relative to its centre change, but the distance r between the point and the centre remains constant

$$r^2 = x^2 + y^2 = \text{constant} . \tag{1.13}$$

More generally, the coordinates {x, y, z} of the interval between any two points in 3-dimensional space (a vector) change when the coordinate system is rotated in 3 dimensions, but the separation r of the two points remains constant

$$r^2 = x^2 + y^2 + z^2 = \text{constant} . \tag{1.14}$$

1.7.2 4D spacetime wheel

Figure ?? shows a 4D spacetime wheel. The diagram here is a spacetime diagram, with time t vertical and space x horizontal. A rotation between time t and space x is a Lorentz boost in the x-direction. As the spacetime wheel boosts, a point on the wheel describes an invariant hyperbola. The spacetime coordinates {t, x} of a point on the wheel relative to its centre change, but the spacetime separation s between the point and the centre remains constant

$$s^2 = -t^2 + x^2 = \text{constant} . \tag{1.15}$$

More generally, the coordinates {t, x, y, z} of the interval between any two events in 4-dimensional spacetime (a 4-vector) change when the coordinate system is boosted or rotated, but the spacetime separation s of the two events remains constant

$$s^2 = -t^2 + x^2 + y^2 + z^2 = \text{constant} . \tag{1.16}$$

1.7.3 Lorentz boost as a rotation by an imaginary angle

The − sign instead of a + sign in front of the t² in the spacetime separation formula (1.16) means that time t can often be treated mathematically as if it were an imaginary spatial dimension. That is, t = iw where $i \equiv \sqrt{-1}$ and w is a “fourth spatial coordinate”.

A Lorentz boost by a velocity $v$ can likewise be treated as a rotation by an imaginary angle. Consider a normal spatial rotation in which a primed frame is rotated in the $wx$-plane clockwise by an angle $a$ about the origin, relative to the unprimed frame. The relation between the coordinates $\{w', x'\}$ and $\{w, x\}$ of a point in the two frames is

$$ \begin{pmatrix} w' \\ x' \end{pmatrix} = \begin{pmatrix} \cos a & -\sin a \\ \sin a & \cos a \end{pmatrix} \begin{pmatrix} w \\ x \end{pmatrix} . \tag{1.17} $$

Now set $t = iw$ and $\alpha = ia$ with $t$ and $\alpha$ both real. In other words, take the spatial coordinate $w$ to be imaginary, and the rotation angle $a$ likewise to be imaginary. Then the rotation formula above becomes

$$ \begin{pmatrix} t' \\ x' \end{pmatrix} = \begin{pmatrix} \cosh\alpha & -\sinh\alpha \\ -\sinh\alpha & \cosh\alpha \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix} . \tag{1.18} $$

This agrees with the usual Lorentz transformation formula (??) if the boost velocity $v$ and boost angle $\alpha$ are related by

$$ v = \tanh\alpha , \tag{1.19} $$

so that

$$ \gamma = \cosh\alpha , \qquad \gamma v = \sinh\alpha . \tag{1.20} $$

This provides a convenient way to add velocities in special relativity: the boost angles simply add (for boosts along the same direction), just as spatial rotation angles add (for rotations about the same axis). Thus a boost by velocity $v_1 = \tanh\alpha_1$ followed by a boost by velocity $v_2 = \tanh\alpha_2$ in the same direction gives a net velocity boost of $v = \tanh\alpha$ where

$$ \alpha = \alpha_1 + \alpha_2 . \tag{1.21} $$

The equivalent formula for the velocities themselves is

$$ v = \frac{v_1 + v_2}{1 + v_1 v_2} , \tag{1.22} $$

the special relativistic velocity addition formula.
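The equivalence of adding boost angles (1.21) and adding velocities (1.22) is easy to check numerically. The following is a minimal Python sketch, in units where $c = 1$; the function names are illustrative and do not come from the text.

```python
import math

def add_velocities(v1, v2):
    """Velocity addition formula (1.22), in units c = 1."""
    return (v1 + v2) / (1.0 + v1 * v2)

def add_via_boost_angles(v1, v2):
    """Same sum obtained by adding boost angles, equations (1.19) and (1.21)."""
    alpha = math.atanh(v1) + math.atanh(v2)
    return math.tanh(alpha)

v1, v2 = 0.6, 0.8
print(add_velocities(v1, v2))        # direct formula
print(add_via_boost_angles(v1, v2))  # via boost angles; should agree to rounding
```

Both routes give the same net velocity, and the result stays below 1 however close to the speed of light the two input velocities are.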

1.7.4 Trip across the Universe at constant acceleration

Suppose that you took a trip across the Universe in a spaceship, accelerating all the time at one Earth gravity $g$. How far would you travel in how much time? The spacetime wheel offers a cute way to solve this problem, since the rotating spacetime wheel can be regarded as representing spacetime frames undergoing constant acceleration. Specifically, points on the right quadrant of the rotating spacetime wheel represent worldlines of persons who accelerate with constant acceleration in their own frame. If the units of space and time are chosen so that the speed of light and the gravitational acceleration are both one, $c = g = 1$, then the proper time experienced by the accelerating person is the boost angle $\alpha$, and


Table 1.1 Trip across the Universe.

Time elapsed on      Time elapsed on     Distance travelled
spaceship in years   Earth in years      in lightyears        To
α                    sinh α              cosh α − 1

0                    0                   0                    Earth (starting point)
1                    1.175               0.5431
2                    3.627               2.762
2.337                5.127               4.223                Proxima Cen
3.962                26.3                25.3                 Vega
6.60                 368                 367                  Pleiades
10.9                 2.7 × 10^4          2.7 × 10^4           Centre of Milky Way
15.4                 2.44 × 10^6         2.44 × 10^6          Andromeda galaxy
18.4                 4.9 × 10^7          4.9 × 10^7           Virgo cluster
19.2                 1.1 × 10^8          1.1 × 10^8           Coma cluster
25.3                 5 × 10^10           5 × 10^10            Edge of observable Universe

the time and space coordinates of the accelerating person, relative to a person who remains at rest, are those of a point on the spacetime wheel, namely

$$ \{t, x\} = \{\sinh\alpha, \cosh\alpha\} . \tag{1.23} $$

In the case where the acceleration is one Earth gravity, $g = 9.80665~\mathrm{m\,s^{-2}}$, the unit of time is

$$ \frac{c}{g} = \frac{299{,}792{,}458~\mathrm{m\,s^{-1}}}{9.80665~\mathrm{m\,s^{-2}}} = 0.97~\mathrm{yr} , \tag{1.24} $$

just short of one year. For simplicity, Table 1.1, which tabulates some milestones along the way, takes the unit of time to be exactly one year, which would be the case if one were accelerating at $0.97\,g = 9.5~\mathrm{m\,s^{-2}}$.

After a slow start, you cover ground at an ever increasing rate, crossing 50 billion lightyears, the distance to the edge of the currently observable Universe, in just over 25 years of your own time. Does this mean you go faster than the speed of light? No. From the point of view of a person at rest on Earth, you never go faster than the speed of light. From your own point of view, distances along your direction of motion are Lorentz-contracted, so distances that are vast from Earth’s point of view appear much shorter to you. Fast as the Universe rushes by, it never goes faster than the speed of light.

This rosy picture of being able to flit around the Universe has drawbacks. Firstly, it would take a huge amount of energy to keep you accelerating at $g$. Secondly, you would use up a huge amount of Earth time travelling around at relativistic speeds. If you took a trip to the edge of the Universe, then by the time you got back not only would all your friends and relations be dead, but the Earth would probably be gone, swallowed by the Sun in its red giant phase, the Sun would have exhausted its fuel and shrivelled into a cold white dwarf star, and the Solar System, having orbited the Galaxy a thousand times, would be lost somewhere in its milky ways.
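The entries of Table 1.1 follow directly from $\{t, x\} = \{\sinh\alpha, \cosh\alpha\}$, with the distance travelled measured from the starting point $x = 1$. A short Python sketch, assuming the table's simplification that the unit of time is exactly one year (the function names are illustrative only):

```python
import math

def trip(alpha):
    """Earth time and distance travelled after ship proper time alpha.

    Units c = g = 1; with the table's simplification these are years
    and lightyears respectively.
    """
    return math.sinh(alpha), math.cosh(alpha) - 1.0

def ship_time_to_reach(distance):
    """Invert distance = cosh(alpha) - 1 to get the ship proper time alpha."""
    return math.acosh(distance + 1.0)

for alpha in (1.0, 2.0, 3.962):
    t_earth, distance = trip(alpha)
    print(f"{alpha:6.3f} {t_earth:10.3f} {distance:10.3f}")

# Ship time needed to cover 4.223 lightyears (the Proxima Cen row)
print(ship_time_to_reach(4.223))
```

For large $\alpha$ both $\sinh\alpha$ and $\cosh\alpha$ approach $e^\alpha/2$, which is why the Earth-time and distance columns of the table become indistinguishable in the later rows.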


Technical point. The Universe is expanding, so the distance to the edge of the currently observable Universe is increasing. Thus it would actually take longer than indicated in the table to reach the edge of the currently observable Universe. Moreover if the Universe is accelerating, as recent evidence from the Hubble diagram of Type Ia Supernovae suggests, then you will never be able to reach the edge of the currently observable Universe, however fast you go.

1.8 Scalar spacetime distance

One of the most fundamental features of a Lorentz transformation is that it leaves invariant a certain distance, the scalar spacetime distance, between any two events in spacetime. The scalar spacetime distance $\Delta s$ between two events separated by $\{\Delta t, \Delta x, \Delta y, \Delta z\}$ is given by

$$ \Delta s^2 = - \Delta t^2 + \Delta r^2 = - \Delta t^2 + \Delta x^2 + \Delta y^2 + \Delta z^2 . \tag{1.25} $$

A quantity such as $\Delta s^2$ that remains unchanged under any Lorentz transformation is called a scalar. It is left to you in exercise 1.9 to show explicitly that $\Delta s^2$ is unchanged under Lorentz transformations. Lorentz transformations can be defined as linear spacetime transformations that leave $\Delta s^2$ invariant. The single scalar spacetime squared interval $\Delta s^2$ replaces the two scalar quantities

$$ \begin{array}{l} \text{time interval } \Delta t \\ \text{distance interval } \Delta r = \sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2} \end{array} \tag{1.26} $$

of classical Galilean spacetime.

Exercise 1.9 Invariant spacetime interval. Show that the squared spacetime interval $\Delta s^2$ defined by equation (1.25) is unchanged by a Lorentz transformation, that is, show that

$$ - \Delta t'^2 + \Delta x'^2 + \Delta y'^2 + \Delta z'^2 = - \Delta t^2 + \Delta x^2 + \Delta y^2 + \Delta z^2 . \tag{1.27} $$

You may assume without proof the familiar results that the 3D scalar product $\Delta r^2 = \Delta x^2 + \Delta y^2 + \Delta z^2$ is unchanged by a spatial rotation, so it suffices to consider a Lorentz boost, say in the $x$ direction.

1.8.1 Proper time, proper distance

The scalar spacetime squared interval $\Delta s^2$ has a physical meaning. If an interval $\{\Delta t, \Delta r\}$ is timelike, $\Delta t > \Delta r$, then the square root of minus the spacetime interval squared is the proper time $\Delta\tau$ along it

$$ \Delta\tau = \sqrt{-\Delta s^2} = \sqrt{\Delta t^2 - \Delta r^2} . \tag{1.28} $$

This is the time experienced by an observer moving along that interval.


If an interval $\{\Delta t, \Delta r\}$ is spacelike, $\Delta t < \Delta r$, then the spacetime interval equals the proper distance $\Delta l$ along it

$$ \Delta l = \sqrt{\Delta s^2} = \sqrt{\Delta r^2 - \Delta t^2} . \tag{1.29} $$

This is the distance between two events measured by an observer for whom those events are simultaneous.

Concept question 1.10 Justify the assertions (1.28) and (1.29).

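1.8.2 Timelike, lightlike, spacelike

A spacetime interval $\Delta x^m$ is called

$$ \begin{array}{ll} \text{timelike} & \text{if } \Delta s^2 < 0 , \\ \text{null or lightlike} & \text{if } \Delta s^2 = 0 , \\ \text{spacelike} & \text{if } \Delta s^2 > 0 . \end{array} \tag{1.30} $$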
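The classification (1.30), together with the proper time (1.28) and proper distance (1.29), can be expressed in a few lines of Python. This is a sketch in units $c = 1$; the function names and the returned tuple format are choices made here for illustration.

```python
import math

def interval_squared(dt, dx, dy, dz):
    """Scalar spacetime interval squared, equation (1.25), units c = 1."""
    return -dt**2 + dx**2 + dy**2 + dz**2

def classify(dt, dx, dy, dz):
    """Return (kind, magnitude) for an interval.

    The magnitude is the proper time (1.28) for a timelike interval and
    the proper distance (1.29) for a spacelike interval.
    """
    s2 = interval_squared(dt, dx, dy, dz)
    if s2 < 0:
        return "timelike", math.sqrt(-s2)
    if s2 > 0:
        return "spacelike", math.sqrt(s2)
    return "lightlike", 0.0

print(classify(5, 3, 0, 0))  # timelike interval
print(classify(3, 5, 0, 0))  # spacelike interval
print(classify(1, 1, 0, 0))  # lightlike interval
```

The exact comparison with zero is adequate for the integer inputs above; with floating-point data a small tolerance would be a sensible addition.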

1.8.3 Minkowski metric

The scalar spacetime squared interval $\Delta s^2$ associated with an interval $\Delta x^m = \{\Delta t, \Delta \boldsymbol{r}\} = \{\Delta t, \Delta x, \Delta y, \Delta z\}$ can be written

$$ \Delta s^2 = \Delta x_m \Delta x^m = \eta_{mn} \Delta x^m \Delta x^n \tag{1.31} $$

where $\eta_{mn}$ is the Minkowski metric

$$ \eta_{mn} \equiv \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{1.32} $$

Equation (1.31) uses the implicit summation convention, according to which paired indices are implicitly summed over. Invariably, one of a pair of repeated indices is raised, the other lowered.

1.9 4-vectors

A 4-vector in special relativity is a quantity $a^m = \{a^t, a^x, a^y, a^z\}$ that transforms under Lorentz transformations like an interval $x^m = \{t, x, y, z\}$ of spacetime

$$ a'^m = L^m{}_n \, a^n \tag{1.33} $$

where $L^m{}_n$ denotes a Lorentz transformation.


1.9.1 Index notation

In special and general relativity it is convenient to introduce two versions of the same 4-vector quantity, one with raised indices, called the contravariant components of the 4-vector,

$$ a^m \equiv \{a^t, a^x, a^y, a^z\} , \tag{1.34} $$

and one with lowered indices, called the covariant components of the 4-vector,

$$ a_m \equiv \{-a^t, a^x, a^y, a^z\} \tag{1.35} $$

(the naming is crazy, and you do not need to remember it). The indices run over $m = t, x, y, z$, or sometimes $m = 0, 1, 2, 3$. Why introduce raised and lowered indices? Because

$$ \sum_m a^m a_m \equiv a^m a_m = a^t a_t + a^x a_x + a^y a_y + a^z a_z = - (a^t)^2 + (a^x)^2 + (a^y)^2 + (a^z)^2 \tag{1.36} $$

is a Lorentz scalar.
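As an illustration of why the lowered components are useful, the sketch below lowers an index with the Minkowski metric, forms $a^m a_m$, and checks numerically that the result is unchanged by a Lorentz boost along $x$. It is a minimal Python example in units $c = 1$; the helper names and the sample components are assumptions made here, not part of the text.

```python
import math

ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric

def lower_index(a_up):
    """a_m = eta_mn a^n: flips the sign of the time component, as in (1.35)."""
    return [eta * a for eta, a in zip(ETA_DIAG, a_up)]

def scalar_product(a_up, b_up):
    """a_m b^m, which reduces to (1.36) when b = a."""
    return sum(a_dn * b for a_dn, b in zip(lower_index(a_up), b_up))

def boost_x(a_up, v):
    """Lorentz boost of the contravariant components along x, units c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    at, ax, ay, az = a_up
    return [gamma * (at - v * ax), gamma * (ax - v * at), ay, az]

a = [2.0, 1.0, 0.5, -0.3]          # arbitrary sample components
boosted = boost_x(a, 0.6)
print(scalar_product(a, a))             # value in the original frame
print(scalar_product(boosted, boosted)) # same value, up to rounding
```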

1.10 Energy-momentum 4-vector

Symmetry argument:

Symmetry             Conservation law

Time translation     Energy
Space translation    Momentum

suggests

$$ \left. \begin{array}{l} \text{energy} = \text{time component} \\ \text{momentum} = \text{space component} \end{array} \right\} \text{ of 4-vector.} \tag{1.37} $$

The Principle of Special Relativity requires that the equation of energy-momentum conservation

$$ \begin{pmatrix} \text{energy} \\ \text{momentum} \end{pmatrix} = \text{constant} \tag{1.38} $$

should take the same form in any inertial frame. The equation should be Lorentz covariant, that is, the equation should transform like a Lorentz 4-vector.


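1.10.1 Construction of the energy-momentum 4-vector

Require:
1. it’s a 4-vector;
2. it goes over to the Newtonian limit as $v \to 0$.

Newtonian limit: Momentum $\boldsymbol{p}$ is mass $m$ times velocity $\boldsymbol{v}$

$$ \boldsymbol{p} = m \boldsymbol{v} = m \frac{d\boldsymbol{r}}{dt} . \tag{1.39} $$

4D version: Need to do two things to Newtonian momentum:
• replace $\boldsymbol{r}$ by a 4-vector $x^m = \{t, \boldsymbol{r}\}$;
• replace $dt$ by a scalar; the only obvious choice is the proper time interval $d\tau$.

Result:

$$ p^k = m \frac{dx^k}{d\tau} = m \left\{ \frac{dt}{d\tau} , \frac{d\boldsymbol{r}}{d\tau} \right\} = m \{\gamma, \gamma \boldsymbol{v}\} \tag{1.40} $$

which are special relativistic versions of energy $E$ and momentum $\boldsymbol{p}$

$$ p^k = \{E, \boldsymbol{p}\} = \{m\gamma, m\gamma \boldsymbol{v}\} . \tag{1.41} $$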

1.10.2 Special relativistic energy

$$ E = m\gamma \quad (\text{units } c = 1) \tag{1.42} $$

or, restoring standard units

$$ E = mc^2 \gamma . \tag{1.43} $$

Taylor expand $\gamma$ for small velocity $v$:

$$ \gamma = \frac{1}{\sqrt{1 - v^2/c^2}} = 1 + \frac{1}{2} \frac{v^2}{c^2} + ... \tag{1.44} $$

so

$$ E = mc^2 \left( 1 + \frac{1}{2} \frac{v^2}{c^2} + ... \right) = mc^2 + \frac{1}{2} m v^2 + ... . \tag{1.45} $$

The first term, $mc^2$, is the rest-mass energy. The second term, $\frac{1}{2} m v^2$, is the non-relativistic kinetic energy. Higher-order terms give relativistic corrections to the kinetic energy.
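To get a feeling for the size of the relativistic corrections, the following Python sketch compares the exact energy (1.43) with the two-term expansion (1.45). The function names are illustrative. The comparison value $\frac{3}{8} v^4/c^4$ is the next term of the binomial series for $\gamma$, a standard result that is not quoted in the text above.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_gamma(v):
    """Lorentz gamma factor for speed v in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def energy_exact(m, v):
    """E = m c^2 gamma, equation (1.43)."""
    return m * C**2 * lorentz_gamma(v)

def energy_two_terms(m, v):
    """Rest-mass energy plus non-relativistic kinetic energy, equation (1.45)."""
    return m * C**2 + 0.5 * m * v**2

m = 1.0  # kg
for beta in (0.01, 0.1, 0.5):
    v = beta * C
    shortfall = (energy_exact(m, v) - energy_two_terms(m, v)) / (m * C**2)
    # Print the neglected part of E/(m c^2) next to the next series term
    print(beta, shortfall, 0.375 * beta**4)
```

At a tenth of the speed of light the neglected terms amount to a few parts in $10^5$ of the rest-mass energy; at half the speed of light they are already a few percent.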


1.10.3 Rest mass is a scalar

The scalar quantity constructed from the energy-momentum 4-vector $p^k = \{E, \boldsymbol{p}\}$ is

$$ p_k p^k = - E^2 + p^2 = - m^2 (\gamma^2 - \gamma^2 v^2) = - m^2 \tag{1.46} $$

minus the square of the rest mass.
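A numerical check of (1.46): build $p^k = m\{\gamma, \gamma \boldsymbol{v}\}$ for some velocity and recover the rest mass from $E^2 - p^2$. This is a Python sketch in units $c = 1$, with hypothetical function names chosen for the example.

```python
import math

def four_momentum(m, velocity):
    """p^k = m {gamma, gamma v} for a 3-velocity given as (vx, vy, vz), c = 1."""
    v_squared = sum(component**2 for component in velocity)
    gamma = 1.0 / math.sqrt(1.0 - v_squared)
    return [m * gamma] + [m * gamma * component for component in velocity]

def rest_mass(p):
    """Rest mass from p_k p^k = -E^2 + p^2 = -m^2, equation (1.46)."""
    energy, px, py, pz = p
    return math.sqrt(energy**2 - (px**2 + py**2 + pz**2))

p = four_momentum(2.0, (0.6, 0.0, 0.0))
print(p)             # energy 2.5 and x-momentum 1.5, up to rounding
print(rest_mass(p))  # recovers the rest mass 2, up to rounding
```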

1.11 Photon energy-momentum

Photons have zero rest mass

$$ m = 0 . \tag{1.47} $$

Thus

$$ p_k p^k = - E^2 + p^2 = - m^2 = 0 \tag{1.48} $$

whence

$$ p \equiv |\boldsymbol{p}| = E . \tag{1.49} $$

Hence

$$ p^k = \{E, \boldsymbol{p}\} = E \{1, \boldsymbol{n}\} = h\nu \{1, \boldsymbol{n}\} \tag{1.50} $$

where $\nu$ is the photon frequency. The photon velocity is $\boldsymbol{n}$, a unit vector. The photon speed is one (the speed of light).

1.11.1 Lorentz transformation of photon energy-momentum 4-vector

Follows the usual rules for 4-vectors. In the case that the Lorentz transformation is a Lorentz boost along the $x$-axis, the transformation is

$$ \begin{pmatrix} p'^t \\ p'^x \\ p'^y \\ p'^z \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p^t \\ p^x \\ p^y \\ p^z \end{pmatrix} = \begin{pmatrix} \gamma (p^t - v p^x) \\ \gamma (p^x - v p^t) \\ p^y \\ p^z \end{pmatrix} . \tag{1.51} $$


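Equivalently

$$ h\nu' \begin{pmatrix} 1 \\ n'^x \\ n'^y \\ n'^z \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} h\nu \begin{pmatrix} 1 \\ n^x \\ n^y \\ n^z \end{pmatrix} = h\nu \begin{pmatrix} \gamma (1 - n^x v) \\ \gamma (n^x - v) \\ n^y \\ n^z \end{pmatrix} . $$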

These mathematical relations imply the rules of 4-dimensional perspective, §1.13.1.

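1.11.2 Redshift

Astronomers define the redshift $z$ of a photon by

$$ z \equiv \frac{\lambda_{\rm obs} - \lambda_{\rm emit}}{\lambda_{\rm emit}} . \tag{1.52} $$

In relativity, it is often more convenient to use the redshift factor $1 + z$

$$ 1 + z \equiv \frac{\lambda_{\rm obs}}{\lambda_{\rm emit}} = \frac{\nu_{\rm emit}}{\nu_{\rm obs}} . \tag{1.53} $$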

1.11.3 Special relativistic Doppler shift

If the emitter frame (primed) is moving with velocity $v$ in the $x$-direction relative to the observer frame (unprimed) then

$$ h\nu_{\rm emit} = h\nu_{\rm obs} \, \gamma (1 - n^x v) \tag{1.54} $$

so

$$ 1 + z = \frac{\nu_{\rm emit}}{\nu_{\rm obs}} = \gamma (1 - n^x v) = \gamma (1 - \boldsymbol{n} \cdot \boldsymbol{v}) . \tag{1.55} $$

This is the general formula for the special relativistic Doppler shift.
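The Doppler formula (1.55) is straightforward to evaluate for any emitter velocity and photon direction. In the Python sketch below, in units $c = 1$, $\boldsymbol{n}$ is the unit vector along the photon's direction of travel in the observer frame, so an emitter receding directly from the observer has $\boldsymbol{n}$ antiparallel to $\boldsymbol{v}$. The function name and the tuple representation of vectors are choices made for this example.

```python
import math

def redshift_factor(v_vec, n_vec):
    """1 + z = gamma (1 - n.v), equation (1.55), in units c = 1.

    v_vec: emitter velocity relative to the observer.
    n_vec: unit vector along the photon's direction of travel (observer frame).
    """
    v_squared = sum(c * c for c in v_vec)
    gamma = 1.0 / math.sqrt(1.0 - v_squared)
    n_dot_v = sum(n * c for n, c in zip(n_vec, v_vec))
    return gamma * (1.0 - n_dot_v)

v = (0.6, 0.0, 0.0)
print(redshift_factor(v, (-1.0, 0.0, 0.0)))  # emitter receding: redshifted
print(redshift_factor(v, (1.0, 0.0, 0.0)))   # emitter approaching: blueshifted
print(redshift_factor(v, (0.0, 1.0, 0.0)))   # transverse: pure time dilation, gamma
```

For motion along the line of sight the result reduces to $[(1 + v)/(1 - v)]^{\pm 1/2}$, which for $v = 0.6$ gives factors of 2 and 1/2.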

1.12 Abstract 4-vectors

$$ \boldsymbol{A} = A^m \boldsymbol{\gamma}_m \tag{1.56} $$


1.13 What things look like at relativistic speeds

1.13.1 The rules of 4-dimensional perspective

The diagram below illustrates the rules of 4-dimensional perspective, also called “special relativistic beaming,” which describe how a scene appears when you move through it at near light speed.

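[Figure: celestial sphere (left, observer at rest) and celestial ellipsoid (right, observer moving), with labelled lengths 1, 1, γv, and γ.]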

On the left, you are at rest relative to the scene. Imagine painting the scene on a celestial sphere around you. The arrows represent the directions of light rays (photons) from the scene on the celestial sphere to you at the center.

On the right, you are moving to the right through the scene, at some fraction of the speed of light. The celestial sphere is stretched along the direction of your motion into a celestial ellipsoid. You, the observer, are not at the center of the ellipsoid, but rather at one of its foci (the left one, if you are moving to the right). The scene appears relativistically aberrated, which is to say concentrated ahead of you, and expanded behind you.

The lengths of the arrows are proportional to the energies, or frequencies, of the photons that you see. When you are moving through the scene at near light speed, the arrows ahead of you, in your direction of motion, are longer than at rest, so you see the photons blue-shifted, increased in energy, increased in frequency. Conversely, the arrows behind you are shorter than at rest, so you see the photons red-shifted, decreased in energy, decreased in frequency. Since photons are good clocks, the change in photon frequency also tells you how fast or slow clocks attached to the scene appear to you to run.

Numbers? On the right, you are moving through the scene at $v = 0.6\,c$. The celestial ellipsoid is stretched along the direction of your motion by the Lorentz gamma factor, which here is $\gamma = 1/\sqrt{1 - 0.6^2} = 1.25$. The focus of the celestial ellipsoid, where you the observer are, is displaced from center by $\gamma v = 1.25 \times 0.6 = 0.75$.
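The numbers quoted for the celestial ellipsoid can be reproduced, and cross-checked against ordinary ellipse geometry, with the Python sketch below. It assumes that the ellipsoid is stretched by $\gamma$ along the direction of motion and keeps unit size in the transverse directions; the unit transverse size is an assumption read off the figure labels rather than a statement in the text. The function name is illustrative.

```python
import math

def celestial_ellipsoid(v):
    """Geometry of the celestial ellipsoid for observer speed v (units c = 1).

    Returns (gamma, focus_offset, geometric_focus, arrow_ahead, arrow_behind).
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    semi_major = gamma        # stretch along the direction of motion
    semi_minor = 1.0          # assumption: transverse size is unchanged
    focus_offset = gamma * v  # displacement of the observer from the centre
    # Focal distance of an ellipse computed from its semi-axes
    geometric_focus = math.sqrt(semi_major**2 - semi_minor**2)
    # Lengths of the arrows arriving at the focus from directly ahead and behind
    arrow_ahead = semi_major + focus_offset
    arrow_behind = semi_major - focus_offset
    return gamma, focus_offset, geometric_focus, arrow_ahead, arrow_behind

print(celestial_ellipsoid(0.6))
```

The displacement $\gamma v$ coincides with the focal distance $\sqrt{\gamma^2 - 1}$ of an ellipse with semi-axes $\gamma$ and 1, and the arrow lengths directly ahead and behind, $\gamma(1 + v)$ and $\gamma(1 - v)$, are the blueshift and redshift factors for head-on and receding photons.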


1.14 How to programme Lorentz transformations on a computer

3D gaming programmers are familiar with the fact that the best way to program spatial rotations on a computer is with quaternions. Compared to standard rotation matrices, quaternions offer increased speed and require less storage, and their algebraic properties simplify interpolation and splining. Section 1.7 showed that a Lorentz boost is mathematically equivalent to a rotation by an imaginary angle. This suggests that Lorentz transformations might be treated as complexified spatial rotations, which proves to be true. Indeed, the best way to program Lorentz transformations on a computer is with complex quaternions, as will be demonstrated in Chapter 14.

Exercises

Exercise 1.11 Twin paradox. Your twin leaves you on Earth and travels to the spacestation Alpha, $\ell = 3$ lyr away, at a good fraction of the speed of light, then immediately returns to Earth at the same speed. The accompanying spacetime diagram shows the corresponding worldlines of both you and your twin. Aside from part 1 and the first part of part 2, I want you to derive your answers mathematically, using logic and Lorentz transformations. However, the diagram is accurately drawn, and you should be able to check your answers by measuring.

1. Label the worldlines of you and your twin. Draw the worldline of a light signal which travels from you on Earth, hits Alpha just when your twin arrives, and immediately returns to Earth. Draw the twin’s “now” when just arriving at Alpha, and the twin’s “now” just departing from Alpha (in the first case the twin is moving toward Alpha, while in the second case the twin is moving back toward Earth).

2. From the diagram, measure the twin’s speed $v$ relative to you, in units where the speed of light is unity, $c = 1$. Deduce the Lorentz gamma factor $\gamma$, and the redshift factor $1 + z = [(1 + v)/(1 - v)]^{1/2}$, in the cases (i) where the twin is receding, and (ii) where the twin is approaching.

3. Choose the spacetime origin to be the event where the twin leaves Earth. Argue that the position 4-vector of the twin on arrival at Alpha is

$$ \{t, x, y, z\} = \{\ell/v, \ell, 0, 0\} . \tag{1.57} $$

Lorentz transform this 4-vector to determine the position 4-vector of the twin on arrival at Alpha, in the twin’s frame. Express your answer first in terms of $\ell$, $v$, and $\gamma$, and then in (light)years. State in words what this position 4-vector means.

4. How much do you and your twin age respectively during the round trip to Alpha and back? What is the ratio of these ages? Express your answers first in terms of $\ell$, $v$, and $\gamma$, and then in years.

5. What is the distance between the Earth and Alpha from the twin’s point of view? What is the ratio of this distance to the distance between Earth and Alpha from your point of view? Explain how you arrived at your result. Express your answer first in terms of $\ell$, $v$, and $\gamma$, and then in lightyears.

6. You watch your twin through a telescope. How much time do you see (through the telescope) elapse on your twin’s wristwatch between launch and arrival on Alpha? How much time passes on your own wristwatch during this time? What is the ratio of these two times? Express your answers first in terms of $\ell$, $v$, and $\gamma$, and then in years.

7. On arrival at Alpha, your twin looks back through a telescope at your wristwatch. How much time does your twin see (through the telescope) has elapsed since launch on your watch? How much time has elapsed on the twin’s own wristwatch during this time? What is the ratio of these two times? Express your answers first in terms of $\ell$, $v$, and $\gamma$, and then in years.

8. You continue to watch your twin through a telescope. How much time elapses on your twin’s wristwatch, as seen by you through the telescope, during the twin’s journey back from Alpha to Earth? How much time passes on your own watch as you watch (through the telescope) the twin journey back from Alpha to Earth? What is the ratio of these two times? Express your answers first in terms of $\ell$, $v$, and $\gamma$, and then in years.

9. During the journey back from Alpha to Earth, your twin likewise continues to look through a telescope at the time registered on your watch. How much time passes on your wristwatch, as seen by your twin through the telescope, during the journey back? How much time passes on the twin’s wristwatch from the twin’s point of view during the journey back? What is the ratio of these two times? Express your answers first in terms of $\ell$, $v$, and $\gamma$, and then in years.

Exercise 1.12 Lines intersecting at right angles. Prove that if two lines appear to intersect at right angles projected on the sky in one frame, then they appear to intersect at right angles in another frame Lorentz-transformed with respect to the first.

PART TWO COORDINATE APPROACH TO GENERAL RELATIVITY

Concept Questions

1. What assumption of general relativity makes it possible to introduce a coordinate system?
2. Is the speed of light a universal constant in general relativity? If so, in what sense?
3. What does “locally inertial” mean? How local is local?
4. Why is spacetime locally inertial?
5. What assumption of general relativity makes it possible to introduce clocks and rulers?
6. Consider two observers at the same point and with the same instantaneous velocity, but one is accelerating and the other is in free-fall. What is the relation between the proper time or proper distance along an infinitesimal interval measured by the two observers? What assumption of general relativity implies this?
7. Does the (Strong) Principle of Equivalence imply that two unequal masses will fall at the same rate in a gravitational field? Explain.
8. In what respects is the Strong Principle of Equivalence (gravity is equivalent to acceleration) stronger than the Weak Principle of Equivalence (gravitating mass equals inertial mass)?
9. Standing on the surface of the Earth, you hold an object of negative mass in your hand, and drop it. According to the Principle of Equivalence, does the negative mass fall up or down?
10. Same as the previous question, but what does Newtonian gravity predict?
11. You have a box of negative mass particles, and you remove energy from it. Do the particles move faster or slower? Does the entropy of the box increase or decrease? Does the pressure exerted by the particles on the walls of the box increase or decrease?
12. You shine two light beams along identical directions in a gravitational field. The two light beams are identical in every way except that they have two different frequencies. Does the Equivalence Principle imply that the interference pattern produced by each of the beams individually is the same?
13. What is a “straight line”, according to the Principle of Equivalence?
14. If all objects move on straight lines, how is it that when, standing on the surface of the Earth, you throw two objects in the same direction but with different velocities, they follow two different trajectories?
15. In relativity, what is the generalization of the “shortest distance between two points”?
16. What kinds of general coordinate transformations are allowed in general relativity?


17. In general relativity, what is a scalar? A 4-vector? A tensor? Which of the following is a scalar/vector/tensor/none-of-the-above? (a) a set of coordinates $x^\mu$; (b) a coordinate interval $dx^\mu$; (c) proper time $\tau$?
18. What does general covariance mean?
19. What does parallel transport mean?
20. Why is it important to define covariant derivatives that behave like tensors?
21. Is covariant differentiation a derivation? That is, is covariant differentiation a linear operation, and does it obey the Leibniz rule for the derivative of a product?
22. What is the covariant derivative of the metric tensor? Explain.
23. What does a connection coefficient $\Gamma^\kappa_{\mu\nu}$ mean physically? Is it a tensor? Why, or why not?
24. An astronaut is in free-fall in orbit around the Earth. Can the astronaut detect that there is a gravitational field?
25. Can a gravitational field exist in flat space?
26. How can you tell whether a given metric is equivalent to the Minkowski metric of flat space?
27. How many degrees of freedom does the metric have? How many of these degrees of freedom can be removed by arbitrary transformations of the spacetime coordinates, and therefore how many physical degrees of freedom are there in spacetime?
28. If you insist that the spacetime is spherical, how many physical degrees of freedom are there in the spacetime?
29. If you insist that the spacetime is spatially homogeneous and isotropic (the cosmological principle), how many physical degrees of freedom are there in the spacetime?
30. In general relativity, you are free to prescribe any spacetime (any metric) you like, including metrics with wormholes and metrics that connect the future to the past so as to violate causality. True or false?
31. If it is true that in general relativity you can prescribe any metric you like, then why aren’t you bumping into wormholes and causality violations all the time?
32. How much mass does it take to curve space significantly (significantly meaning by of order unity)?
33. What is the relation between the energy-momentum 4-vector of a particle and the energy-momentum tensor?
34. It is straightforward to go from a prescribed metric to the energy-momentum tensor. True or false?
35. It is straightforward to go from a prescribed energy-momentum tensor to the metric. True or false?
36. Does the Principle of Equivalence imply Einstein’s equations?
37. What do Einstein’s equations mean physically?
38. What does the Riemann curvature tensor $R_{\kappa\lambda\mu\nu}$ mean physically? Is it a tensor?
39. The Riemann tensor splits into compressive (Ricci) and tidal (Weyl) parts. What do these parts mean, physically?
40. Einstein’s equations imply conservation of energy-momentum, but what does that mean?
41. Do Einstein’s equations describe gravitational waves?
42. Do photons (massless particles) gravitate?
43. How do different forms of mass-energy gravitate?
44. How does negative mass gravitate?

What’s important?

This part of the notes adopts the traditional coordinate-based approach to general relativity. The approach is neither the most insightful nor the most powerful, but it is the fastest route to connecting the metric to the energy-momentum content of spacetime.

1. Postulates of general relativity. How do the various postulates imply the mathematical structure of general relativity?
2. The road from spacetime curvature to energy-momentum: metric $g_{\mu\nu}$ → connection coefficients $\Gamma^\kappa_{\mu\nu}$ → Riemann curvature tensor $R_{\kappa\lambda\mu\nu}$ → Ricci tensor $R_{\kappa\mu}$ and scalar $R$ → Einstein tensor $G_{\kappa\mu} = R_{\kappa\mu} - \frac{1}{2} g_{\kappa\mu} R$ → energy-momentum tensor $T_{\kappa\mu}$.
3. 4-velocity and 4-momentum. Geodesic equation.
4. Bianchi identities guarantee conservation of energy-momentum.

2 Fundamentals of General Relativity

2.1 The postulates of General Relativity

General relativity follows from three postulates:
1. Spacetime is a 4-dimensional manifold;
2. The (Strong) Principle of Equivalence;
3. Einstein’s Equations.

2.1.1 Spacetime is a 4-dimensional manifold

A 4-dimensional manifold is defined mathematically to be a topological space that is locally homeomorphic to Euclidean 4-space $\mathbb{R}^4$. This postulate implies that it is possible to set up a coordinate system (possibly in patches) $x^\mu \equiv \{x^0, x^1, x^2, x^3\}$ such that each point of (the patch of) spacetime has a unique coordinate.

Andrew’s convention: Greek (brown) dummy indices label curved spacetime coordinates. Latin (black) dummy indices label locally inertial (more generally, tetrad) coordinates.

2.1.2 (Strong) Principle of Equivalence (PE)

“The laws of physics in a gravitating frame are equivalent to those in an accelerating frame”. The Weak Principle of Equivalence is “Gravitating mass = inertial mass”.

$$ \text{PE} \Rightarrow \text{spacetime is locally inertial (see §2.2).} \tag{2.1} $$


2.1.3 Einstein’s equations

Einstein’s equations comprise a 4 × 4 symmetric matrix of equations

$$ G_{\mu\nu} = 8\pi G \, T_{\mu\nu} . \tag{2.2} $$

Here $G$ is the Newtonian gravitational constant, $G_{\mu\nu}$ is the Einstein tensor, and $T_{\mu\nu}$ is the energy-momentum tensor. Physically, Einstein’s equations signify

$$ \text{(compressive part of) curvature} = \text{energy-momentum content} . \tag{2.3} $$

Einstein’s equations generalize Poisson’s equation

$$ \nabla^2 \Phi = 4\pi G \rho \tag{2.4} $$

where $\Phi$ is the Newtonian gravitational potential, and $\rho$ the mass-energy density. Poisson’s equation is the time-time component of Einstein’s equations in the limit of a weak gravitational field and slowly moving matter.

2.2 Existence of locally inertial frames

The Principle of Equivalence implies that at each point of spacetime it is possible to choose a locally inertial, or free-fall, frame, such that the laws of special relativity apply within an infinitesimal neighbourhood of that point. By this is meant that at each point of spacetime it is possible to choose coordinates such that (a) the metric at that point is Minkowski, and (b) the first derivatives of the metric are all zero.

It is built into the Principle of Equivalence that general relativity is, like special relativity, a metric theory. Notably, the proper times and distances measured by an accelerating observer are the same as those measured by a freely-falling observer at the same point and with the same instantaneous velocity.

2.3 Metric

The metric is the essential mathematical object that converts an infinitesimal coordinate interval

$$ dx^\mu \equiv \{dx^0, dx^1, dx^2, dx^3\} \tag{2.5} $$

to a proper measurement of an interval of time or space. Postulate (1) of general relativity means that it is possible to choose coordinates

$$ x^\mu \equiv \{x^0, x^1, x^2, x^3\} \tag{2.6} $$

covering (a patch of) spacetime.


Postulate (2) of general relativity implies that at each point of spacetime it is possible to choose locally inertial coordinates

$$ \xi^m \equiv \{\xi^0, \xi^1, \xi^2, \xi^3\} \tag{2.7} $$

such that the metric is Minkowski,

$$ ds^2 = \eta_{mn} \, d\xi^m d\xi^n , \tag{2.8} $$

in an infinitesimal neighborhood of the point. The spacetime distance squared $ds^2$ is a scalar, a quantity that is unchanged by the choice of coordinates. Since

$$ d\xi^m = \frac{\partial \xi^m}{\partial x^\mu} dx^\mu \tag{2.9} $$

it follows that

$$ ds^2 = \eta_{mn} \frac{\partial \xi^m}{\partial x^\mu} \frac{\partial \xi^n}{\partial x^\nu} dx^\mu dx^\nu \tag{2.10} $$

so the scalar spacetime distance squared is

$$ ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu \tag{2.11} $$

where $g_{\mu\nu}$ is the metric, a 4 × 4 symmetric matrix

$$ g_{\mu\nu} = \eta_{mn} \frac{\partial \xi^m}{\partial x^\mu} \frac{\partial \xi^n}{\partial x^\nu} . \tag{2.12} $$
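Equation (2.12) can be tried out on a simple case. The Python sketch below takes flat spacetime, for which the inertial coordinates $\xi^m = \{t, x, y, z\}$ are valid everywhere, and spherical coordinates $x^\mu = \{t, r, \theta, \phi\}$ with $x = r \sin\theta \cos\phi$, $y = r \sin\theta \sin\phi$, $z = r \cos\theta$. This choice of example, and the function names, are assumptions made for illustration; the Jacobian $\partial\xi^m/\partial x^\mu$ is written out by hand.

```python
import math

ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric

def jacobian(r, theta, phi):
    """d(xi^m)/d(x^mu): rows m = {t, x, y, z}, columns mu = {t, r, theta, phi}."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, st * cp, r * ct * cp, -r * st * sp],
        [0.0, st * sp, r * ct * sp, r * st * cp],
        [0.0, ct, -r * st, 0.0],
    ]

def metric(r, theta, phi):
    """g_mu_nu = eta_mn (d xi^m / d x^mu) (d xi^n / d x^nu), equation (2.12)."""
    J = jacobian(r, theta, phi)
    return [[sum(ETA_DIAG[m] * J[m][mu] * J[m][nu] for m in range(4))
             for nu in range(4)]
            for mu in range(4)]

for row in metric(2.0, math.pi / 3, 0.7):
    print([round(entry, 12) for entry in row])
```

The result is the familiar diagonal metric with entries $-1$, $1$, $r^2$, $r^2 \sin^2\theta$, independent of $\phi$.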

2.4 Basis $\boldsymbol{g}_\mu$ of tangent vectors

You are familiar with the idea that in ordinary 3D Euclidean geometry it is often convenient to treat vectors in an abstract coordinate-independent formalism. Thus for example a 3-vector is commonly written as an abstract quantity $\boldsymbol{r}$. The coordinates of the vector $\boldsymbol{r}$ may be $\{x, y, z\}$ in some particular coordinate system, but one recognizes that the vector $\boldsymbol{r}$ has a meaning, a magnitude and a direction, that is independent of the coordinate system adopted. In an arbitrary Cartesian coordinate system, the Euclidean 3-vector $\boldsymbol{r}$ can be expressed

$$ \boldsymbol{r} = \sum_i \hat{\boldsymbol{x}}_i \, x^i = \hat{\boldsymbol{x}} \, x + \hat{\boldsymbol{y}} \, y + \hat{\boldsymbol{z}} \, z \tag{2.13} $$

where $\hat{\boldsymbol{x}}_i \equiv \{\hat{\boldsymbol{x}}, \hat{\boldsymbol{y}}, \hat{\boldsymbol{z}}\}$ are unit vectors along each of the coordinate axes. The same kind of abstract notation is useful in general relativity. Define $\boldsymbol{g}_\mu$

$$ \boldsymbol{g}_\mu \equiv \{\boldsymbol{g}_0, \boldsymbol{g}_1, \boldsymbol{g}_2, \boldsymbol{g}_3\} \tag{2.14} $$

to be the basis of axes tangent to the coordinates $x^\mu$. Each axis $\boldsymbol{g}_\mu$ is a 4D vector object, with both magnitude and direction in spacetime. (Some texts represent the tangent vectors $\boldsymbol{g}_\mu$ with the notation $\partial_\mu$, but this notation is not used here, to avoid the potential confusion between $\partial_\mu$ as a derivative and $\partial_\mu$ as a vector.)


An interval $dx^\mu$ of spacetime can be expressed in coordinate-independent fashion as the abstract vector interval $d\boldsymbol{x}$

$$ d\boldsymbol{x} \equiv \boldsymbol{g}_\mu \, dx^\mu = \boldsymbol{g}_0 \, dx^0 + \boldsymbol{g}_1 \, dx^1 + \boldsymbol{g}_2 \, dx^2 + \boldsymbol{g}_3 \, dx^3 . \tag{2.15} $$

The scalar length squared of the abstract vector interval $d\boldsymbol{x}$ is

$$ ds^2 = d\boldsymbol{x} \cdot d\boldsymbol{x} = \boldsymbol{g}_\mu \cdot \boldsymbol{g}_\nu \, dx^\mu dx^\nu \tag{2.16} $$

whence

$$ g_{\mu\nu} = \boldsymbol{g}_\mu \cdot \boldsymbol{g}_\nu \tag{2.17} $$

the metric is the (4D) scalar product of tangent vectors.

The tangent vectors $\boldsymbol{g}_\mu$ form a basis for a 4D tangent space that has three important mathematical properties. First, the tangent space is a vector space, that is, it has the properties of linearity that define a vector space. Second, the tangent space has an inner (or scalar) product, defined by the metric (2.17). Third, vectors in the tangent space can be differentiated with respect to coordinates, as will be elucidated in §2.6.3.

2.5 4-vectors and tensors

2.5.1 Contravariant coordinate 4-vector

Under a general coordinate transformation

$$ x^\mu \to x'^\mu \tag{2.18} $$

a coordinate interval $dx^\mu$ transforms as

$$ dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu} dx^\nu . \tag{2.19} $$

In general relativity, a coordinate 4-vector is defined to be a quantity $A^\mu = \{A^0, A^1, A^2, A^3\}$ that transforms under a coordinate transformation (2.18) like a coordinate interval

$$ A'^\mu = \frac{\partial x'^\mu}{\partial x^\nu} A^\nu . \tag{2.20} $$

2.5.2 Abstract 4-vector

A 4-vector may be written in coordinate-independent fashion as

$$ \boldsymbol{A} = \boldsymbol{g}_\mu A^\mu . \tag{2.21} $$

The quantity $\boldsymbol{A}$ is an abstract 4-vector. Although $\boldsymbol{A}$ is a 4-vector, it is by construction unchanged by a coordinate transformation, and is therefore a coordinate scalar. See §2.5.6 for commentary on the distinction between abstract and coordinate vectors.


2.5.3 Lowering and raising indices

Define $g^{\mu\nu}$ to be the inverse metric, satisfying

$$ g_{\lambda\mu} \, g^{\mu\nu} = \delta_\lambda^\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{2.22} $$

The metric $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$ provide the means of lowering and raising coordinate indices. The components of a coordinate 4-vector $A^\mu$ with raised index are called its contravariant components, while those $A_\mu$ with lowered indices are called its covariant components,

$$ A_\mu = g_{\mu\nu} A^\nu , \tag{2.23} $$

$$ A^\mu = g^{\mu\nu} A_\nu . \tag{2.24} $$
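Lowering and raising an index are just matrix-vector products with the metric and its inverse. The Python sketch below uses a small Gauss-Jordan routine to invert a sample non-diagonal metric and checks that lowering followed by raising returns the original components. The sample metric is an assumption chosen for illustration: it is the Minkowski metric expressed in sheared coordinates with $t = T$, $x = X + T/2$. The function names are likewise illustrative.

```python
def invert(g):
    """Inverse of a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(g)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(g)]
    for col in range(n):
        # Choose the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda row_index: abs(aug[row_index][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for row_index in range(n):
            if row_index != col:
                factor = aug[row_index][col]
                aug[row_index] = [x - factor * y
                                  for x, y in zip(aug[row_index], aug[col])]
    return [row[n:] for row in aug]

def contract(matrix, vector):
    """Sum over the second index of the matrix with the vector index."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Sample metric (assumed example): Minkowski in sheared coordinates
g_lower = [[-0.75, 0.5, 0.0, 0.0],
           [0.5, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
g_upper = invert(g_lower)

A_up = [1.0, 2.0, 3.0, 4.0]
A_down = contract(g_lower, A_up)        # equation (2.23)
A_up_again = contract(g_upper, A_down)  # equation (2.24)
print(A_down)
print(A_up_again)
```

For a diagonal metric the inverse is simply the reciprocal of each diagonal entry, so the general inversion routine is only needed when the coordinates are not orthogonal.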

2.5.4 Covariant coordinate 4-vector

Under a general coordinate transformation (2.18), the covariant components $A_\mu$ of a coordinate 4-vector transform as

$$ A'_\mu = \frac{\partial x^\nu}{\partial x'^\mu} A_\nu . \tag{2.25} $$

You can check that the transformation law (2.25) for the covariant components $A_\mu$ is consistent with the transformation law (2.20) for the contravariant components $A^\mu$. You can check that the tangent vectors $\boldsymbol{g}_\mu$ transform as a covariant coordinate 4-vector.

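2.5.5 Scalar product

If $A^\mu$ and $B^\mu$ are coordinate 4-vectors, then their scalar product is

$$ A_\mu B^\mu = A^\mu B_\mu = g_{\mu\nu} A^\mu B^\nu . \tag{2.26} $$

This is a coordinate scalar, a quantity that remains invariant under general coordinate transformations. In abstract vector formalism, the scalar product of two 4-vectors $\boldsymbol{A} = \boldsymbol{g}_\mu A^\mu$ and $\boldsymbol{B} = \boldsymbol{g}_\mu B^\mu$ is

$$ \boldsymbol{A} \cdot \boldsymbol{B} = \boldsymbol{g}_\mu \cdot \boldsymbol{g}_\nu \, A^\mu B^\nu = g_{\mu\nu} A^\mu B^\nu . \tag{2.27} $$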

2.5.6 Comment on vector naming and notation

Different texts follow different conventions for naming and notating vectors and tensors. In this book I follow the convention of calling both $A^\mu$ (with a dummy index $\mu$) and $\boldsymbol{A} \equiv A^\mu \boldsymbol{g}_\mu$ vectors. Although $A^\mu$ and $\boldsymbol{A}$ are both vectors, they are mathematically different objects.


If the index on a vector indicates a specific coordinate, then the indexed vector is the component of the vector; for example $A^0$ (or $A^t$) is the $x^0$ (or time $t$) component of the coordinate 4-vector $A^\mu$.

In this book, the different species of vector are distinguished by an adjective:

1. A coordinate vector $A^\mu$, identified by Greek (brown) indices $\mu$, is one that changes in a prescribed way under coordinate transformations. A coordinate transformation is one that changes the coordinates of the spacetime without actually changing the spacetime or whatever lies in it.
2. An abstract vector $\boldsymbol{A}$, identified by boldface, is the thing itself, and is unchanged by the choice of coordinates. Since the abstract vector is unchanged by a coordinate transformation, it is a coordinate scalar.

All the types of vector have the properties of linearity (additivity, multiplication by scalars) that identify them mathematically as belonging to vector spaces. The important distinction between the types of vector is how they behave under transformations.

In referring to both $A^\mu$ and $\boldsymbol{A}$ as vectors, I am following the standard physics practice of mentally regarding $A^\mu$ and $\boldsymbol{A}$ as equivalent objects. You are familiar with the advantages of treating a vector in 3D Euclidean space either as an abstract vector $\boldsymbol{A}$, or as a coordinate vector $A^i$. Depending on the problem, sometimes the abstract notation $\boldsymbol{A}$ is more convenient, and sometimes the coordinate notation $A^i$ is more convenient. Sometimes it's convenient to switch between the two in the middle of a calculation. Likewise in general relativity it is convenient to have the flexibility to work in either coordinate or abstract notation, whatever suits the problem of the moment.

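2.5.7 Coordinate tensor

In general, a coordinate tensor $A^{\kappa\lambda...}_{\mu\nu...}$ is an object that transforms under general coordinate transformations (2.18) as

$$ {A'}^{\kappa\lambda...}_{\mu\nu...} = \frac{\partial x'^\kappa}{\partial x^\alpha} \frac{\partial x'^\lambda}{\partial x^\beta} \cdots \frac{\partial x^\gamma}{\partial x'^\mu} \frac{\partial x^\delta}{\partial x'^\nu} \cdots A^{\alpha\beta...}_{\gamma\delta...} . \tag{2.28} $$

You can check that the metric tensor $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$ are indeed coordinate tensors, transforming like (2.28). The rank of a tensor is the number of indices. A scalar is a tensor of rank 0. A 4-vector is a tensor of rank 1.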

2.6 Covariant derivatives

2.6.1 Derivative of a coordinate scalar

Suppose that $\Phi$ is a coordinate scalar. Then the coordinate derivative of $\Phi$ is a coordinate 4-vector

$$ \frac{\partial \Phi}{\partial x^\mu} \quad \text{is a coordinate tensor} \tag{2.29} $$

transforming like equation (2.25). As a shorthand, the ordinary partial derivative is often denoted in the literature with a comma

$$ \frac{\partial \Phi}{\partial x^\mu} = \Phi_{,\mu} . \tag{2.30} $$

For the most part this book does not use the comma notation.

2.6.2 Derivative of a coordinate 4-vector

The ordinary partial derivative of a covariant coordinate 4-vector $A_\mu$ is not a tensor

$$ \frac{\partial A_\mu}{\partial x^\nu} \quad \text{is not a coordinate tensor} \tag{2.31} $$

because it does not transform like a coordinate tensor. However, the 4-vector $\boldsymbol{A} = \boldsymbol{g}_\mu A^\mu$, being by construction invariant under coordinate transformations, is a coordinate scalar, and its partial derivative is a coordinate 4-vector

$$ \begin{aligned} \frac{\partial \boldsymbol{A}}{\partial x^\nu} &= \frac{\partial ( \boldsymbol{g}_\mu A^\mu )}{\partial x^\nu} \\ &= \boldsymbol{g}_\mu \frac{\partial A^\mu}{\partial x^\nu} + \frac{\partial \boldsymbol{g}_\mu}{\partial x^\nu} A^\mu \quad \text{is a coordinate tensor} . \end{aligned} \tag{2.32} $$

The last line of equation (2.32) assumes that it is legitimate to differentiate the tangent vectors $\boldsymbol{g}_\mu$, but what does this mean? The partial derivatives of basis vectors $\boldsymbol{g}_\mu$ are defined in the usual way by

$$ \frac{\partial \boldsymbol{g}_\mu}{\partial x^\nu} \equiv \lim_{\delta x^\nu \rightarrow 0} \frac{\boldsymbol{g}_\mu (x^0, ..., x^\nu + \delta x^\nu, ..., x^3) - \boldsymbol{g}_\mu (x^0, ..., x^\nu, ..., x^3)}{\delta x^\nu} . \tag{2.33} $$

This definition relies on being able to compare the vectors $\boldsymbol{g}_\mu(x)$ at some point $x$ with the vectors $\boldsymbol{g}_\mu(x + \delta x)$ at another point $x + \delta x$ a small distance away. The comparison between two vectors a small distance apart is made possible by the existence of locally inertial frames. In a locally inertial frame, two vectors a small distance apart can be compared by parallel-transporting one vector to the location of the other along the small interval between them, that is, by transporting the vector without accelerating or precessing with respect to the locally inertial frame. Thus $\boldsymbol{g}_\mu(x + \delta x)$ in the definition (2.33) should be interpreted as its value parallel-transported from position $x + \delta x$ to position $x$ along the small interval $\delta x$ between them.

2.6.3 Coordinate connection coefficients (Christoffel symbols)

The partial derivatives of the basis vectors $\boldsymbol{g}_\mu$ that appear on the right hand side of equation (2.32) define the coordinate connection coefficients $\Gamma^\kappa_{\mu\nu}$, also known as Christoffel symbols,

$$ \frac{\partial \boldsymbol{g}_\mu}{\partial x^\nu} \equiv \Gamma^\kappa_{\mu\nu} \, \boldsymbol{g}_\kappa \quad \text{is not a coordinate tensor} . \tag{2.34} $$

The definition (2.34) shows that the connection coefficients express how each tangent vector $\boldsymbol{g}_\mu$ changes, relative to parallel-transport, when shifted along an interval $\delta x^\nu$.


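2.6.4 Covariant derivative of a contravariant 4-vector

Expression (2.32) along with the definition (2.34) of the connection coefficients implies that

$$ \begin{aligned} \frac{\partial \boldsymbol{A}}{\partial x^\nu} &= \boldsymbol{g}_\mu \frac{\partial A^\mu}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu} \, \boldsymbol{g}_\kappa A^\mu \\ &= \boldsymbol{g}_\kappa \left( \frac{\partial A^\kappa}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu} A^\mu \right) \quad \text{is a coordinate tensor} . \end{aligned} \tag{2.35} $$

The expression in parentheses is a coordinate tensor, and defines the covariant derivative $D_\nu A^\kappa$ of the contravariant coordinate 4-vector $A^\kappa$

$$ D_\nu A^\kappa \equiv \frac{\partial A^\kappa}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu} A^\mu \quad \text{is a coordinate tensor} . \tag{2.36} $$

As a shorthand, the covariant derivative is often denoted in the literature with a semi-colon

$$ D_\nu A^\kappa = A^\kappa{}_{;\nu} . \tag{2.37} $$

For the most part this book does not use the semi-colon notation.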

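2.6.5 Covariant derivative of a covariant coordinate 4-vector

Similarly,

$$ \frac{\partial \boldsymbol{A}}{\partial x^\nu} = \boldsymbol{g}^\kappa D_\nu A_\kappa \quad \text{is a coordinate tensor} \tag{2.38} $$

where $D_\nu A_\kappa$ is the covariant derivative of the covariant coordinate 4-vector $A_\kappa$

$$ D_\nu A_\kappa \equiv \frac{\partial A_\kappa}{\partial x^\nu} - \Gamma^\mu_{\kappa\nu} A_\mu \quad \text{is a coordinate tensor} . \tag{2.39} $$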

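2.6.6 Covariant derivative of a coordinate tensor

In general, the covariant derivative of a coordinate tensor is

$$ D_\alpha A^{\kappa\lambda...}_{\mu\nu...} = \frac{\partial A^{\kappa\lambda...}_{\mu\nu...}}{\partial x^\alpha} + \Gamma^\kappa_{\beta\alpha} A^{\beta\lambda...}_{\mu\nu...} + \Gamma^\lambda_{\beta\alpha} A^{\kappa\beta...}_{\mu\nu...} + ... - \Gamma^\beta_{\mu\alpha} A^{\kappa\lambda...}_{\beta\nu...} - \Gamma^\beta_{\nu\alpha} A^{\kappa\lambda...}_{\mu\beta...} - ... \tag{2.40} $$

with a positive $\Gamma$ term for each contravariant index, and a negative $\Gamma$ term for each covariant index.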
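As a worked instance of the rule (2.40), consider the covariant derivative of the metric tensor itself, which has two covariant indices and therefore two negative $\Gamma$ terms. The sketch below uses the lowered connection $\Gamma_{\lambda\mu\nu} \equiv g_{\lambda\kappa} \Gamma^\kappa_{\mu\nu}$ of equation (2.55) and the expression (2.54) for the partial derivatives of the metric, both of which appear later in §2.6.10.

```latex
D_\alpha g_{\mu\nu}
  = \frac{\partial g_{\mu\nu}}{\partial x^\alpha}
    - \Gamma^\beta_{\mu\alpha} g_{\beta\nu}
    - \Gamma^\beta_{\nu\alpha} g_{\mu\beta}
  = \frac{\partial g_{\mu\nu}}{\partial x^\alpha}
    - \Gamma_{\nu\mu\alpha} - \Gamma_{\mu\nu\alpha}
  = 0 .
```

The final equality is equation (2.54) with relabelled indices, so the covariant derivative of the metric vanishes identically under the definitions used here.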

2.6.7 No-torsion condition

The existence of locally inertial frames requires that it must be possible to arrange not only that the tangent axes $\boldsymbol{g}_\mu$ are orthonormal at a point, but also that they remain orthonormal to first order in a Taylor expansion about the point. That is, it must be possible to choose the coordinates such that the tangent axes $\boldsymbol{g}_\mu$ are orthonormal, and unchanged to linear order:

$$ \boldsymbol{g}_\mu \cdot \boldsymbol{g}_\nu = \eta_{\mu\nu} , \tag{2.41} $$

$$ \frac{\partial \boldsymbol{g}_\mu}{\partial x^\nu} = 0 . \tag{2.42} $$

In view of the definition (2.34) of the connection coefficients, the second condition (2.42) is equivalent to the vanishing of all the connection coefficients:

$$ \Gamma^\kappa_{\mu\nu} = 0 . \tag{2.43} $$

Under a general coordinate transformation $x^\mu \rightarrow x'^\mu$, the tangent axes transform as $\boldsymbol{g}_\mu \rightarrow \boldsymbol{g}'_\mu = (\partial x^\nu / \partial x'^\mu) \, \boldsymbol{g}_\nu$. The $4 \times 4$ matrix $\partial x^\nu / \partial x'^\mu$ of partial derivatives provides 16 degrees of freedom in choosing the tangent axes at a point. The 16 degrees of freedom are enough (more than enough) to accomplish the orthonormality condition (2.41), which is a symmetric $4 \times 4$ matrix equation with 10 degrees of freedom. The additional $16 - 10 = 6$ degrees of freedom are Lorentz transformations, which rotate the tangent axes $\boldsymbol{g}_\mu$, but leave the metric $\eta_{\mu\nu}$ unchanged.

Just as it is possible to reorient the tangent axes $\boldsymbol{g}_\mu$ at a point by adjusting the matrix $\partial x'^\nu / \partial x^\mu$ of first partial derivatives of the coordinate transformation $x^\mu \rightarrow x'^\mu$, so also it is possible to reorient the derivatives $\partial \boldsymbol{g}_\mu / \partial x^\nu$ of the tangent axes by adjusting the matrix $\partial^2 x'^\nu / \partial x^\lambda \partial x^\mu$ of second partial derivatives. The second partial derivatives comprise a set of 4 symmetric $4 \times 4$ matrices, for a total of $4 \times 10 = 40$ degrees of freedom. However, there are $4 \times 4 \times 4 = 64$ connection coefficients $\Gamma^\kappa_{\mu\nu}$, all of which the condition (2.43) requires to vanish. The matrix of second derivatives is thus $64 - 40 = 24$ degrees of freedom short of being able to make all the connections vanish.

The resolution of the problem is that, as shown below, equation (2.51), there are 24 combinations of the connections that form a tensor, the torsion tensor. If a tensor is zero in one frame, then it is automatically zero in any other frame. Thus the requirement that all the connections vanish requires that the torsion tensor vanish. This requires, from the expression (2.51) for the torsion tensor, the no-torsion condition that the connection coefficients are symmetric in their last two indices

$$ \Gamma^\kappa_{\mu\nu} = \Gamma^\kappa_{\nu\mu} . \tag{2.44} $$
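The counting argument above is simple arithmetic, and a short sketch makes it easy to repeat for other numbers of dimensions. The function name `torsion_counting` and the generalisation from 4 to $n$ dimensions are assumptions of this example, not part of the text.

```python
def torsion_counting(n=4):
    """Repeat the degree-of-freedom counting of the no-torsion argument
    in n dimensions (n = 4 reproduces the numbers quoted in the text)."""
    first_derivs = n * n                         # matrix of first partial derivatives
    metric_conditions = n * (n + 1) // 2         # symmetric n x n condition (2.41)
    lorentz = first_derivs - metric_conditions   # leftover freedom: Lorentz transformations
    second_derivs = n * (n * (n + 1) // 2)       # n symmetric n x n matrices of second derivatives
    connections = n ** 3                         # all Gamma^kappa_{mu nu}
    shortfall = connections - second_derivs      # equals the number of torsion components
    return lorentz, second_derivs, connections, shortfall

lorentz, second_derivs, connections, shortfall = torsion_counting(4)
```

For `n = 4` the shortfall equals $4 \times 6$, the number of components of a tensor with one free index and one antisymmetric pair of indices, which matches the index structure of the torsion tensor (2.51).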

It should be emphasized that the condition of vanishing torsion is an assumption of general relativity, not a mathematical necessity. It has been shown in this section that torsion vanishes if and only if spacetime is locally flat, meaning that at any point coordinates can be found such that conditions (2.41) and (2.42) are true. The assumption of local flatness is central to the idea of the principle of equivalence. But it is an assumption, not a consequence, of the theory.

Concept question 2.1 If torsion does not vanish, then there is no locally inertial frame. What does parallel-transport mean in such a case?


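2.6.8 Aside: torsion and the integrability of the position vector

From the definition (2.34) of the connection coefficients, the no-torsion condition (2.44) is equivalent to

$$ \frac{\partial \boldsymbol{g}_\mu}{\partial x^\nu} - \frac{\partial \boldsymbol{g}_\nu}{\partial x^\mu} = 0 . \tag{2.45} $$

According to Frobenius' theorem, this condition (2.45) is precisely the condition for the system $\boldsymbol{g}_\mu$ to be integrable, that is, there exists a position vector $\boldsymbol{X}$ whose partial derivatives are

$$ \boldsymbol{g}_\mu = \frac{\partial \boldsymbol{X}}{\partial x^\mu} . \tag{2.46} $$

Equivalently, the total differential of the position vector $\boldsymbol{X}$ is

$$ d\boldsymbol{X} = \boldsymbol{g}_\mu \, dx^\mu . \tag{2.47} $$

The abstract vector interval $d\boldsymbol{x} \equiv \boldsymbol{g}_\mu \, dx^\mu$ was defined by equation (2.15) as the coordinate-independent version of a spacetime interval $dx^\mu$. The notation $d\boldsymbol{x}$ was merely symbolic: $d\boldsymbol{x}$ was not necessarily a total differential of something. However, when the no-torsion condition holds, equation (2.47) implies that $d\boldsymbol{x}$ is in fact the total differential $d\boldsymbol{X}$ of the position vector $\boldsymbol{X}$

$$ d\boldsymbol{x} = d\boldsymbol{X} . \tag{2.48} $$

The no-torsion condition (2.44) is equivalent to the commutation of partial derivatives of $\boldsymbol{X}$:

$$ \Gamma^\kappa_{\mu\nu} \, \boldsymbol{g}_\kappa \equiv \frac{\partial \boldsymbol{g}_\mu}{\partial x^\nu} = \frac{\partial^2 \boldsymbol{X}}{\partial x^\nu \partial x^\mu} = \frac{\partial \boldsymbol{g}_\nu}{\partial x^\mu} \equiv \Gamma^\kappa_{\nu\mu} \, \boldsymbol{g}_\kappa . \tag{2.49} $$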

The physical meaning of torsion is discussed further in §3.4.

2.6.9 Torsion tensor

General relativity assumes no torsion, but it is possible to consider generalizations to theories with torsion. The torsion tensor $S^\mu_{\kappa\lambda}$ is defined by the commutator of the covariant derivative acting on a scalar $\Phi$

$$ [D_\kappa , D_\lambda] \, \Phi = S^\mu_{\kappa\lambda} \frac{\partial \Phi}{\partial x^\mu} \quad \text{is a coordinate tensor} . \tag{2.50} $$

Note that the covariant derivative of a scalar is just the ordinary derivative, $D_\lambda \Phi = \partial \Phi / \partial x^\lambda$. The expression (2.39) for the covariant derivatives shows that the torsion tensor is

$$ S^\mu_{\kappa\lambda} = \Gamma^\mu_{\kappa\lambda} - \Gamma^\mu_{\lambda\kappa} \quad \text{is a coordinate tensor} \tag{2.51} $$

which is evidently antisymmetric in the indices $\kappa\lambda$.

In Einstein-Cartan theory, the torsion tensor is related to the spin content of spacetime. Since the spin content vanishes in empty space, Einstein-Cartan theory is indistinguishable from general relativity in experiments carried out in vacuum.

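Exercise 2.2 Show that

$$ D_\kappa A_\lambda - D_\lambda A_\kappa = \frac{\partial A_\lambda}{\partial x^\kappa} - \frac{\partial A_\kappa}{\partial x^\lambda} + S^\mu_{\kappa\lambda} A_\mu . \tag{2.52} $$

Conclude that, if torsion vanishes as general relativity assumes, $S^\mu_{\kappa\lambda} = 0$, then

$$ D_\kappa A_\lambda - D_\lambda A_\kappa = \frac{\partial A_\lambda}{\partial x^\kappa} - \frac{\partial A_\kappa}{\partial x^\lambda} . \tag{2.53} $$

⋄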

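2.6.10 Connection coefficients in terms of the metric

The connection coefficients have been defined, equation (2.34), as derivatives of the tangent basis vectors $\boldsymbol{g}_\mu$. However, the connection coefficients can be expressed purely in terms of the (first derivatives of the) metric, without reference to the individual basis vectors. The partial derivatives of the metric are

$$ \begin{aligned} \frac{\partial g_{\lambda\mu}}{\partial x^\nu} &= \frac{\partial ( \boldsymbol{g}_\lambda \cdot \boldsymbol{g}_\mu )}{\partial x^\nu} \\ &= \boldsymbol{g}_\lambda \cdot \frac{\partial \boldsymbol{g}_\mu}{\partial x^\nu} + \boldsymbol{g}_\mu \cdot \frac{\partial \boldsymbol{g}_\lambda}{\partial x^\nu} \\ &= \boldsymbol{g}_\lambda \cdot \boldsymbol{g}_\kappa \, \Gamma^\kappa_{\mu\nu} + \boldsymbol{g}_\mu \cdot \boldsymbol{g}_\kappa \, \Gamma^\kappa_{\lambda\nu} \\ &= g_{\lambda\kappa} \Gamma^\kappa_{\mu\nu} + g_{\mu\kappa} \Gamma^\kappa_{\lambda\nu} \\ &= \Gamma_{\lambda\mu\nu} + \Gamma_{\mu\lambda\nu} , \end{aligned} \tag{2.54} $$

which is a sum of two connection coefficients. Here $\Gamma_{\lambda\mu\nu}$ with all indices lowered is defined to be $\Gamma^\kappa_{\mu\nu}$ with the first index lowered by the metric,

$$ \Gamma_{\lambda\mu\nu} \equiv g_{\lambda\kappa} \Gamma^\kappa_{\mu\nu} . \tag{2.55} $$

Combining the metric derivatives in the following fashion yields an expression for a single connection:

$$ \begin{aligned} \frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} &= \Gamma_{\lambda\mu\nu} + \Gamma_{\mu\lambda\nu} + \Gamma_{\lambda\nu\mu} + \Gamma_{\nu\lambda\mu} - \Gamma_{\mu\nu\lambda} - \Gamma_{\nu\mu\lambda} \\ &= 2 \, \Gamma_{\lambda\mu\nu} - S_{\lambda\mu\nu} - S_{\mu\nu\lambda} - S_{\nu\mu\lambda} \\ &= 2 \, \Gamma_{\lambda\mu\nu} , \end{aligned} \tag{2.56} $$

the last line of which follows from the no-torsion condition $S_{\lambda\mu\nu} = 0$. Thus

$$ \Gamma_{\lambda\mu\nu} = \frac{1}{2} \left( \frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} \right) \quad \text{is not a coordinate tensor} . \tag{2.57} $$

This is the formula that allows connection coefficients to be calculated from the metric.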
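Formula (2.57) translates directly into code. The following is a minimal numerical sketch, not a production implementation: it estimates the metric derivatives by central differences and assembles $\Gamma_{\lambda\mu\nu}$. The example metric (a 2-sphere of radius $a$ in coordinates $\theta, \phi$), the step size `h`, and the function names are all assumptions chosen for illustration.

```python
import math

def metric_sphere(x, a=1.0):
    """Assumed example metric g_{mu nu}: 2-sphere of radius a, x = (theta, phi)."""
    theta = x[0]
    return [[a * a, 0.0], [0.0, (a * math.sin(theta)) ** 2]]

def dmetric(metric, x, nu, h=1e-5):
    """Central-difference estimate of d g_{lambda mu} / d x^nu."""
    xp, xm = list(x), list(x)
    xp[nu] += h
    xm[nu] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)] for i in range(n)]

def christoffel_lower(metric, x):
    """Gamma_{lambda mu nu} from equation (2.57), assuming vanishing torsion.
    Returned as a nested list indexed Gamma[lambda][mu][nu]."""
    n = len(x)
    dg = [dmetric(metric, x, nu) for nu in range(n)]  # dg[nu][l][m] = d g_{lm} / d x^nu
    return [[[0.5 * (dg[nu][l][m] + dg[m][l][nu] - dg[l][m][nu])
              for nu in range(n)] for m in range(n)] for l in range(n)]

theta = math.pi / 3
Gamma = christoffel_lower(metric_sphere, [theta, 0.0])
```

For the unit sphere the only non-vanishing components are $\Gamma_{\theta\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma_{\phi\theta\phi} = \Gamma_{\phi\phi\theta} = \sin\theta\cos\theta$, which the finite-difference estimate reproduces to within the truncation error of the difference scheme. A symbolic algebra package would give exact expressions instead.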


2.6.11 Mathematical aside

General relativity is a metric theory. Many of the structures introduced above can be defined mathematically without a metric. For example, it is possible to define the tangent space of vectors with basis $\boldsymbol{g}_\mu$, and to define a dual vector space with basis $\boldsymbol{g}^\mu$ such that $\boldsymbol{g}_\mu \cdot \boldsymbol{g}^\nu = \delta_\mu^\nu$. Elements of the dual vector space are commonly called one-forms. Similarly it is possible to define connections and covariant derivatives without a metric. However, this book follows general relativity in assuming that spacetime has a metric.

2.7 Coordinate 4-velocity

Consider a particle following a worldline

$$ x^\mu (\tau) , \tag{2.58} $$

where $\tau$ is the particle's proper time. The proper time along any interval of the worldline is $d\tau \equiv \sqrt{- ds^2}$. Define the coordinate 4-velocity $u^\mu$ by

$$ u^\mu \equiv \frac{dx^\mu}{d\tau} \quad \text{is a coordinate 4-vector} . \tag{2.59} $$

The magnitude squared of the 4-velocity is constant

$$ u_\mu u^\mu = g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = \frac{ds^2}{d\tau^2} = -1 . \tag{2.60} $$

The negative sign arises from the choice of metric signature: with the signature $-+++$ adopted here, there is a $-$ sign between $ds^2$ and $d\tau^2$. Equation (2.60) can be regarded as an integral of motion associated with conservation of particle rest mass.
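The normalization (2.60) is easy to verify numerically in the simplest case. The sketch below assumes flat Minkowski spacetime in Cartesian coordinates with $c = 1$ and signature $-+++$, so that $u^\mu = \gamma \{1, v_x, v_y, v_z\}$ with $\gamma$ the Lorentz factor; the function names are hypothetical.

```python
import math

def four_velocity(v):
    """u^mu = dx^mu/dtau for 3-velocity v = (vx, vy, vz), in units with c = 1.
    Assumes flat (Minkowski) spacetime in Cartesian coordinates."""
    v2 = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)   # Lorentz factor dt/dtau
    return [gamma] + [gamma * c for c in v]

def norm_squared(u):
    """u_mu u^mu with the Minkowski metric, signature -+++."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return sum(eta[i] * u[i] * u[i] for i in range(4))

u = four_velocity([0.6, 0.0, 0.3])
u_rest = four_velocity([0.0, 0.0, 0.0])
```

Whatever 3-velocity below the speed of light is supplied, `norm_squared` returns $-1$ up to rounding, since $\gamma^2 (-1 + v^2) = -1$.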

2.8 Geodesic equation

Let $\boldsymbol{u} \equiv \boldsymbol{g}_\mu u^\mu$ be the 4-velocity in coordinate-independent notation. The principle of equivalence implies that the geodesic equation, the equation of motion of a freely-falling particle, is

$$ \frac{d\boldsymbol{u}}{d\tau} = 0 . \tag{2.61} $$

Why? Because $d\boldsymbol{u}/d\tau = 0$ in the particle's own free-fall frame, and the equation is coordinate-independent. In the particle's own free-fall frame, the particle's 4-velocity is $u^\mu = \{1, 0, 0, 0\}$, and the particle's locally inertial axes $\boldsymbol{g}_\mu = \{\boldsymbol{g}_0, \boldsymbol{g}_1, \boldsymbol{g}_2, \boldsymbol{g}_3\}$ are constant.


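What does the equation of motion look like in coordinate notation? The acceleration is

$$ \begin{aligned} \frac{d\boldsymbol{u}}{d\tau} &= \frac{dx^\nu}{d\tau} \frac{\partial \boldsymbol{u}}{\partial x^\nu} \\ &= u^\nu \boldsymbol{g}_\kappa D_\nu u^\kappa \\ &= u^\nu \boldsymbol{g}_\kappa \left( \frac{\partial u^\kappa}{\partial x^\nu} + \Gamma^\kappa_{\mu\nu} u^\mu \right) \\ &= \boldsymbol{g}_\kappa \left( \frac{du^\kappa}{d\tau} + \Gamma^\kappa_{\mu\nu} u^\mu u^\nu \right) . \end{aligned} \tag{2.62} $$

The geodesic equation is then

$$ \frac{du^\kappa}{d\tau} + \Gamma^\kappa_{\mu\nu} u^\mu u^\nu = 0 . \tag{2.63} $$

Another way of writing the geodesic equation is

$$ \frac{Du^\kappa}{D\tau} = 0 , \tag{2.64} $$

where $D/D\tau$ is the covariant proper time derivative

$$ \frac{D}{D\tau} \equiv u^\nu D_\nu . \tag{2.65} $$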
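To see the geodesic equation (2.63) at work, the sketch below integrates it numerically in a deliberately simple setting: the flat Euclidean plane in polar coordinates $(r, \phi)$, with path length playing the role of $\tau$. This toy setting is an assumption of the example, not a spacetime from the text. Its non-vanishing connection coefficients are $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = 1/r$, and since the plane is flat the geodesic should be a straight line. The classical fourth-order Runge-Kutta integrator and the function names are likewise choices made here.

```python
import math

def rhs(state):
    """Right-hand side of the geodesic equation (2.63) for the flat plane in
    polar coordinates, with Gamma^r_{phi phi} = -r and
    Gamma^phi_{r phi} = Gamma^phi_{phi r} = 1/r.
    state = [r, phi, u^r, u^phi]."""
    r, phi, ur, uphi = state
    return [ur, uphi, r * uphi * uphi, -2.0 * ur * uphi / r]

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, f):
        return [s + f * ki for s, ki in zip(state, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(state, h, steps):
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at Cartesian (x, y) = (1, 0) moving in the +y direction at unit speed:
# r = 1, phi = 0, u^r = 0, u^phi = 1. Integrate over a path length of 2.
final = integrate([1.0, 0.0, 0.0, 1.0], h=1e-3, steps=2000)
x = final[0] * math.cos(final[1])
y = final[0] * math.sin(final[1])
```

Converted back to Cartesian coordinates, the end point lies on the straight line $x = 1$ at $y = 2$ to within the integration error, even though $r$ and $\phi$ both vary along the path. The connection terms are what turn the curved-looking coordinate motion into a straight line.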

2.9 Coordinate 4-momentum

The coordinate 4-momentum of a particle of rest mass $m$ is defined to be

$$ p^\mu \equiv m u^\mu = m \frac{dx^\mu}{d\tau} \quad \text{is a coordinate 4-vector} . \tag{2.66} $$

The momentum squared is

$$ p_\mu p^\mu = m^2 u_\mu u^\mu = - m^2 \tag{2.67} $$

minus the square of the rest mass. Again, the minus sign arises from the choice $-+++$ of metric signature.

2.10 Affine parameter

For photons, the rest mass is zero, $m = 0$, but the 4-momentum $p^\mu$ remains finite. Define the affine parameter $\lambda$ by

$$ \lambda \equiv \frac{\tau}{m} \quad \text{is a coordinate scalar} \tag{2.68} $$

which remains finite in the limit $m \rightarrow 0$. The affine parameter $\lambda$ is unique up to an overall linear transformation (that is, $\alpha\lambda + \beta$ is also an affine parameter, for constant $\alpha$ and $\beta$), because of the freedom in the choice of mass $m$ and the zero point of proper time $\tau$. In terms of the affine parameter, the 4-momentum is

$$ p^\mu = \frac{dx^\mu}{d\lambda} . \tag{2.69} $$

The geodesic equation is then in coordinate-independent notation

$$ \frac{d\boldsymbol{p}}{d\lambda} = 0 , \tag{2.70} $$

or in component form

$$ \frac{dp^\kappa}{d\lambda} + \Gamma^\kappa_{\mu\nu} p^\mu p^\nu = 0 , \tag{2.71} $$

which works for massless as well as massive particles. Another way of writing this is

$$ \frac{Dp^\kappa}{D\lambda} = 0 , \tag{2.72} $$

where $D/D\lambda$ is the covariant affine derivative

$$ \frac{D}{D\lambda} \equiv p^\nu D_\nu . \tag{2.73} $$

2.11 Affine distance

The affine parameter is also called the affine distance, because it provides a measure of distance along null geodesics. When you look at a scene with your eyes, you are looking along null geodesics, and the natural measure of distance to objects that you see is the affine distance. The freedom in the overall scaling of the affine distance is fixed by setting it equal to the proper distance near the observer in the observer's locally inertial rest frame. In special relativity, the affine distance coincides with the perceived (e.g. binocular) distance to objects.

2.12 Riemann curvature tensor

The Riemann curvature tensor $R_{\kappa\lambda\mu\nu}$ is defined by the commutator of the covariant derivative acting on a 4-vector

$$ [D_\kappa , D_\lambda] \, A_\mu = R_{\kappa\lambda\mu\nu} A^\nu \quad \text{is a coordinate tensor} . \tag{2.74} $$

The expression (2.74) assumes vanishing torsion; the more general expression with non-zero torsion is (3.20).


The expression (2.39) for the covariant derivative yields the following formula for the Riemann tensor in terms of connection coefficients

$$ R_{\kappa\lambda\mu\nu} = \frac{\partial \Gamma_{\mu\nu\lambda}}{\partial x^\kappa} - \frac{\partial \Gamma_{\mu\nu\kappa}}{\partial x^\lambda} + \Gamma^\alpha_{\mu\lambda} \Gamma_{\alpha\nu\kappa} - \Gamma^\alpha_{\mu\kappa} \Gamma_{\alpha\nu\lambda} \quad \text{is a coordinate tensor} . \tag{2.75} $$

This is the formula that allows the Riemann tensor to be calculated from the connection coefficients.

In flat (Minkowski) space, covariant derivatives reduce to partial derivatives, $D_\kappa \rightarrow \partial / \partial x^\kappa$, and

$$ [D_\kappa , D_\lambda] \rightarrow \left[ \frac{\partial}{\partial x^\kappa} , \frac{\partial}{\partial x^\lambda} \right] = 0 \quad \text{in flat space} \tag{2.76} $$

so that $R_{\kappa\lambda\mu\nu} = 0$ in flat space.

Comment: In quantum field theories (QED, QCD), the commutator of the gauge-covariant derivative is taken to be the field. In conventional general relativity, by contrast, the metric is taken to be the fundamental field, rather than the curvature. Another difference between quantum field theories and general relativity is that the Lagrangian of quantum field theories is taken to be quadratic in the field, whereas the Lagrangian of general relativity is taken to be linear in the curvature (specifically, the general relativity Lagrangian is the Ricci scalar $R$).

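2.13 Symmetries of the Riemann tensor

In a locally inertial frame, the connection coefficients all vanish, $\Gamma_{\lambda\mu\nu} = 0$, but their partial derivatives, which are proportional to second derivatives of the metric tensor, do not vanish. Thus in a locally inertial frame the Riemann tensor is

$$ \begin{aligned} R_{\kappa\lambda\mu\nu} &= \frac{\partial \Gamma_{\mu\nu\lambda}}{\partial x^\kappa} - \frac{\partial \Gamma_{\mu\nu\kappa}}{\partial x^\lambda} \\ &= \frac{1}{2} \left( \frac{\partial^2 g_{\mu\nu}}{\partial x^\kappa \partial x^\lambda} + \frac{\partial^2 g_{\mu\lambda}}{\partial x^\kappa \partial x^\nu} - \frac{\partial^2 g_{\nu\lambda}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\nu}}{\partial x^\lambda \partial x^\kappa} - \frac{\partial^2 g_{\mu\kappa}}{\partial x^\lambda \partial x^\nu} + \frac{\partial^2 g_{\nu\kappa}}{\partial x^\lambda \partial x^\mu} \right) \\ &= \frac{1}{2} \left( \frac{\partial^2 g_{\mu\lambda}}{\partial x^\kappa \partial x^\nu} - \frac{\partial^2 g_{\nu\lambda}}{\partial x^\kappa \partial x^\mu} - \frac{\partial^2 g_{\mu\kappa}}{\partial x^\lambda \partial x^\nu} + \frac{\partial^2 g_{\nu\kappa}}{\partial x^\lambda \partial x^\mu} \right) . \end{aligned} \tag{2.77} $$

You can check that the bottom line of equation (2.77):

1. is antisymmetric in $\kappa \leftrightarrow \lambda$,
2. is antisymmetric in $\mu \leftrightarrow \nu$,
3. is symmetric in $\kappa\lambda \leftrightarrow \mu\nu$,
4. has the property that the sum of the cyclic permutations of the last three indices vanishes

$$ R_{\kappa\lambda\mu\nu} + R_{\kappa\nu\lambda\mu} + R_{\kappa\mu\nu\lambda} = 0 . \tag{2.78} $$

The first three of these four symmetries can be summarized by the shorthand notation

$$ R_{([\kappa\lambda][\mu\nu])} \tag{2.79} $$

in which $[\,]$ denotes anti-symmetrization and $(\,)$ symmetrization. These symmetries imply that the Riemann tensor is a symmetric matrix of antisymmetric matrices. An antisymmetric matrix has 6 degrees of freedom. A symmetric matrix of these things is a $6 \times 6$ symmetric matrix, which has 21 degrees of freedom. The final, cyclic symmetry of the Riemann tensor, equation (2.78), removes 1 degree of freedom. Thus the Riemann tensor has a net 20 degrees of freedom.

Although the above symmetries were derived in a locally inertial frame, the fact that the Riemann tensor is a tensor means that the symmetries hold in any frame. If you prefer, you can add back the products of connection coefficients in equation (2.75), and check that the claimed symmetries remain.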
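The counting of Riemann components can be repeated mechanically. The sketch below follows the same steps as the paragraph above, generalised to $n$ dimensions. The generalisation, the function name `riemann_components`, and the use of the binomial coefficient $\binom{n}{4}$ for the number of independent cyclic constraints (one per set of four distinct index values) are assumptions of this example rather than statements from the text.

```python
from math import comb

def riemann_components(n=4):
    """Count independent Riemann components in n dimensions, following the
    'symmetric matrix of antisymmetric matrices' argument."""
    pairs = n * (n - 1) // 2         # independent antisymmetric index pairs
    sym = pairs * (pairs + 1) // 2   # symmetric matrix indexed by those pairs
    cyclic = comb(n, 4)              # constraints from the cyclic identity (2.78)
    return pairs, sym, cyclic, sym - cyclic

pairs, sym, cyclic, net = riemann_components(4)
```

For `n = 4` this gives 6 pairs, 21 symmetric combinations, 1 cyclic constraint, and a net 20 components, matching the text. The result also agrees with the closed form $n^2 (n^2 - 1) / 12$ in other dimensions.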

2.14 Ricci tensor, Ricci scalar

The Ricci tensor $R_{\kappa\mu}$ and Ricci scalar $R$ are the essentially unique contractions of the Riemann curvature tensor. The Ricci tensor, the compressive part of the Riemann tensor, is

$$ R_{\kappa\mu} \equiv g^{\lambda\nu} R_{\kappa\lambda\mu\nu} \quad \text{is a coordinate tensor} . \tag{2.80} $$

The symmetries of the Riemann tensor imply that the Ricci tensor is symmetric

$$ R_{\kappa\mu} = R_{\mu\kappa} \tag{2.81} $$

and therefore has 10 independent components. The Ricci scalar is

$$ R \equiv g^{\kappa\mu} R_{\kappa\mu} \quad \text{is a coordinate tensor (a scalar)} . \tag{2.82} $$

2.15 Einstein tensor

The Einstein tensor $G_{\kappa\mu}$ is defined by

$$ G_{\kappa\mu} \equiv R_{\kappa\mu} - \tfrac{1}{2} \, g_{\kappa\mu} R \quad \text{is a coordinate tensor} . \tag{2.83} $$

The symmetry of the Ricci and metric tensors implies that the Einstein tensor is likewise symmetric

$$ G_{\kappa\mu} = G_{\mu\kappa} . \tag{2.84} $$

The Einstein tensor has 10 independent components.

2.16 Bianchi identities

The Jacobi identity

$$ [D_\kappa , [D_\lambda , D_\mu]] + [D_\lambda , [D_\mu , D_\kappa]] + [D_\mu , [D_\kappa , D_\lambda]] = 0 \tag{2.85} $$

implies the Bianchi identities

$$ D_\kappa R_{\lambda\mu\nu\pi} + D_\lambda R_{\mu\kappa\nu\pi} + D_\mu R_{\kappa\lambda\nu\pi} = 0 \tag{2.86} $$

which can be written in shorthand

$$ D_{[\kappa} R_{\lambda\mu]\nu\pi} = 0 . \tag{2.87} $$

The Bianchi identities constitute a set of differential relations between the components of the Riemann tensor, which are distinct from the algebraic symmetries of the Riemann tensor. There are 20 independent Bianchi identities. If just the symmetries (2.79) of the Riemann tensor are taken into account, then there are 24 identities; but the cyclic symmetry (2.78) eliminates 4, leaving 20 independent identities.

2.17 Covariant conservation of the Einstein tensor

The most important consequence of the Bianchi identities (2.87) is obtained from the double contraction

$$ g^{\kappa\nu} g^{\lambda\pi} \left( D_\kappa R_{\lambda\mu\nu\pi} + D_\lambda R_{\mu\kappa\nu\pi} + D_\mu R_{\kappa\lambda\nu\pi} \right) = - D^\kappa R_{\kappa\mu} - D^\lambda R_{\lambda\mu} + D_\mu R = 0 \tag{2.88} $$

which implies that

$$ D^\kappa G_{\kappa\mu} = 0 . \tag{2.89} $$

This equation is a primary motivation for the form of the Einstein equations, since it implies energy-momentum conservation, equation (2.91).

2.18 Einstein equations

Einstein's equations are

$$ G_{\kappa\mu} = 8 \pi G \, T_{\kappa\mu} \quad \text{is a coordinate tensor equation} . \tag{2.90} $$

What motivates the form of Einstein's equations?

1. The equation is generally covariant;
2. The Bianchi identities guarantee conservation of energy-momentum;
3. The Einstein tensor depends on the lowest (second) order derivatives of the metric tensor that do not vanish in a locally inertial frame.

The covariant conservation of the Einstein tensor, equation (2.89), implies the conservation of energy-momentum

$$ D^\kappa T_{\kappa\mu} = 0 . \tag{2.91} $$

Einstein's equations (2.90) constitute a complete set of gravitational equations, generalizing Poisson's equation of Newtonian gravity. However, Einstein's equations by themselves do not constitute a closed set of equations: in general, other equations, such as Maxwell's equations of electromagnetism, and equations describing the microphysics of the energy-momentum, must be adjoined to form a closed set.

2.19 Summary of the path from metric to the energy-momentum tensor

1. Start by defining the metric $g_{\mu\nu}$.
2. Compute the connection coefficients $\Gamma_{\lambda\mu\nu}$ from equation (2.57).
3. Compute the Riemann tensor $R_{\kappa\lambda\mu\nu}$ from equation (2.75).
4. Compute the Ricci tensor $R_{\kappa\mu}$ from equation (2.80), the Ricci scalar $R$ from equation (2.82), and the Einstein tensor $G_{\kappa\mu}$ from equation (2.83).
5. The Einstein equations (2.90) then imply the energy-momentum tensor $T_{\kappa\mu}$.

The path from metric to energy-momentum tensor is straightforward to program on a computer, but the results are typically messy and complicated, even for fairly simple spacetimes. Inverting the path to recover the metric from a given energy-momentum content is typically highly non-trivial, the subject of a huge literature. The great majority of metrics $g_{\mu\nu}$ yield an energy-momentum tensor $T_{\kappa\mu}$ that cannot be achieved with normal matter.
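The steps above can be traced by hand in a case small enough to fit on a few lines. The sketch below uses a 2-sphere of radius $a$ with coordinates $(\theta, \phi)$; this two-dimensional example is an illustrative choice that does not appear in the text, and it is a curved surface rather than a spacetime. The comments mark which step each line corresponds to.

```latex
% Step 1: metric of a 2-sphere of radius a
g_{\theta\theta} = a^2 , \qquad g_{\phi\phi} = a^2 \sin^2\theta .

% Step 2: non-vanishing connection coefficients from (2.57)
\Gamma_{\theta\phi\phi} = - a^2 \sin\theta \cos\theta , \qquad
\Gamma_{\phi\theta\phi} = \Gamma_{\phi\phi\theta} = a^2 \sin\theta \cos\theta .

% Step 3: the single independent Riemann component from (2.75)
R_{\theta\phi\theta\phi}
  = \frac{\partial \Gamma_{\theta\phi\phi}}{\partial \theta}
    + \Gamma^\phi_{\theta\phi} \Gamma_{\phi\phi\theta}
  = - a^2 ( \cos^2\theta - \sin^2\theta ) + a^2 \cos^2\theta
  = a^2 \sin^2\theta .

% Step 4: Ricci tensor (2.80), Ricci scalar (2.82), Einstein tensor (2.83)
R_{\theta\theta} = g^{\phi\phi} R_{\theta\phi\theta\phi} = 1 , \qquad
R_{\phi\phi} = g^{\theta\theta} R_{\phi\theta\phi\theta} = \sin^2\theta , \qquad
R = \frac{2}{a^2} , \qquad
G_{\kappa\mu} = R_{\kappa\mu} - \tfrac{1}{2} g_{\kappa\mu} R = 0 .
```

In step 3 the remaining terms of (2.75) vanish because $\Gamma_{\theta\phi\theta} = 0$ and $\Gamma^\alpha_{\theta\theta} = 0$. The Einstein tensor comes out identically zero, as it does for any two-dimensional metric, so step 5 is trivial here. In four dimensions the same sequence of steps applies, but with many more components to track.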

2.20 Energy-momentum tensor of an ideal fluid

The simplest non-trivial energy-momentum tensor is that of an ideal fluid. In this case $T^{\mu\nu}$ is isotropic in the locally inertial rest frame of the fluid, taking the form

$$ T^{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix} \tag{2.92} $$

where

$$ \rho \ \text{is the proper mass-energy density} , \qquad p \ \text{is the proper pressure} . \tag{2.93} $$

The expression (2.92) is valid only in the locally inertial rest frame of the fluid. An expression that is valid in any frame is

$$ T^{\mu\nu} = (\rho + p) u^\mu u^\nu + p \, g^{\mu\nu} , \tag{2.94} $$

where $u^\mu$ is the 4-velocity of the fluid. Equation (2.94) is valid because it is a tensor equation, and it is true in the locally inertial rest frame, where $u^\mu = \{1, 0, 0, 0\}$.
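The claim that (2.94) reduces to (2.92) in the rest frame can be confirmed with a few lines of code. The sketch below assumes a Minkowski inverse metric with signature $-+++$ for the locally inertial frame; the numerical values of $\rho$ and $p$, the boost velocity 0.6, and the function name `ideal_fluid_T` are arbitrary choices for illustration.

```python
def ideal_fluid_T(rho, p, u_up, g_inv):
    """T^{mu nu} = (rho + p) u^mu u^nu + p g^{mu nu}, equation (2.94)."""
    n = len(u_up)
    return [[(rho + p) * u_up[m] * u_up[v] + p * g_inv[m][v]
             for v in range(n)] for m in range(n)]

# Inverse Minkowski metric, signature -+++ (assumed locally inertial frame).
eta_inv = [[-1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
eta_diag = [-1.0, 1.0, 1.0, 1.0]

rho, p = 3.0, 0.5

# Rest frame of the fluid: u^mu = {1, 0, 0, 0}.
T_rest = ideal_fluid_T(rho, p, [1.0, 0.0, 0.0, 0.0], eta_inv)

# Frame in which the fluid moves at speed 0.6 along x: u^mu = gamma {1, v, 0, 0}.
T_boost = ideal_fluid_T(rho, p, [1.25, 0.75, 0.0, 0.0], eta_inv)

# The trace g_{mu nu} T^{mu nu} is a coordinate scalar, equal to -rho + 3p.
trace_rest = sum(eta_diag[i] * T_rest[i][i] for i in range(4))
trace_boost = sum(eta_diag[i] * T_boost[i][i] for i in range(4))
```

In the rest frame the result is the diagonal matrix with entries $\rho, p, p, p$. In the boosted frame the individual components change, but the trace remains $-\rho + 3p$, as expected for a scalar built from a tensor equation.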


2.21 Newtonian limit

The Newtonian limit is obtained in the limit of a weak gravitational field and non-relativistic (pressureless) matter. In Cartesian coordinates, the metric in the Newtonian limit is

$$ ds^2 = - (1 + 2\Phi) \, dt^2 + (1 - 2\Phi) (dx^2 + dy^2 + dz^2) , \tag{2.95} $$

in which

$$ \Phi(x, y, z) = \text{Newtonian potential} \tag{2.96} $$

is a function only of the spatial coordinates $x, y, z$, not of time $t$. For this metric, to first order in the potential $\Phi$ the only non-vanishing component of the Einstein tensor is the time-time component

$$ G_{tt} = 2 \nabla^2 \Phi , \tag{2.97} $$

where $\nabla^2 = \partial^2 / \partial x^2 + \partial^2 / \partial y^2 + \partial^2 / \partial z^2$ is the usual 3D Laplacian operator. This component (2.97) of the Einstein tensor plugged into Einstein's equations (2.90) implies Poisson's equation (2.4).
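The final step can be written out explicitly. The sketch below assumes that, for pressureless matter at rest, the time-time component of the energy-momentum tensor is $T_{tt} = \rho$ to lowest order (equation (2.94) with $p = 0$ and $u^\mu = \{1, 0, 0, 0\}$), and that equation (2.4), which lies outside this section, is Poisson's equation in its usual form.

```latex
G_{tt} = 8 \pi G \, T_{tt}
\quad \Longrightarrow \quad
2 \nabla^2 \Phi = 8 \pi G \rho
\quad \Longrightarrow \quad
\nabla^2 \Phi = 4 \pi G \rho .
```

The factor $8\pi G$ in Einstein's equations (2.90) is thus what is needed to recover the Newtonian factor $4\pi G$.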

3 ∗

More on the coordinate approach

3.1 Weyl tensor

The trace-free, or tidal, part of the Riemann curvature tensor defines the Weyl tensor $C_{\kappa\lambda\mu\nu}$

$$ C_{\kappa\lambda\mu\nu} \equiv R_{\kappa\lambda\mu\nu} - \tfrac{1}{2} \left( g_{\kappa\mu} R_{\lambda\nu} - g_{\kappa\nu} R_{\lambda\mu} + g_{\lambda\nu} R_{\kappa\mu} - g_{\lambda\mu} R_{\kappa\nu} \right) + \tfrac{1}{6} \left( g_{\kappa\mu} g_{\lambda\nu} - g_{\kappa\nu} g_{\lambda\mu} \right) R \quad \text{is a coordinate tensor} . \tag{3.1} $$

The Weyl tensor has 10 independent components. These 10 components together with the 10 components of the Ricci tensor account for the 20 components of the Riemann tensor. The Weyl tensor inherits the symmetries (2.79) of the Riemann tensor

$$ C_{([\kappa\lambda][\mu\nu])} . \tag{3.2} $$

Whereas the Einstein tensor $G_{\kappa\mu}$ necessarily vanishes in a region of spacetime where there is no energy-momentum, $T_{\kappa\mu} = 0$, the Weyl tensor does not. The Weyl tensor expresses the presence of tidal gravitational forces, and of gravitational waves.

3.2 Evolution equations for the Weyl tensor

This section is included because (a) the comparison to Maxwell's equations is neat and insightful, (b) it helps to account for the degrees of freedom of the gravitational field, (c) it shows how the Weyl tensor encodes gravitational waves.

Contracted on one index, the Bianchi identities (2.86) are

$$ D_{[\kappa} R_{\lambda\mu]\nu}{}^{\kappa} = 0 . \tag{3.3} $$

There are 20 such independent contracted identities. Since this is the same as the number of independent Bianchi identities, it follows that the contracted Bianchi identities (3.3) are equivalent to the full set of Bianchi identities (2.87). If the Riemann tensor is separated into its trace (Ricci) and traceless (Weyl) parts, equation (3.1), then the contracted Bianchi identities (3.3) become the Weyl evolution equations

$$ D^\kappa C_{\kappa\lambda\mu\nu} = J_{\lambda\mu\nu} , \tag{3.4} $$

where $J_{\lambda\mu\nu}$ is the Weyl current

$$ J_{\lambda\mu\nu} \equiv \tfrac{1}{2} \left( D_\mu G_{\lambda\nu} - D_\nu G_{\lambda\mu} \right) - \tfrac{1}{6} \left( g_{\lambda\nu} D_\mu G - g_{\lambda\mu} D_\nu G \right) . \tag{3.5} $$

The Weyl evolution equations (3.4) can be regarded as the gravitational analogue of Maxwell's equations of electromagnetism. The Weyl current $J_{\lambda\mu\nu}$ is a vector of bivectors, which would suggest that it has $4 \times 6 = 24$ components, but it loses 4 of those components because of the cyclic identity (2.78), which implies the cyclic symmetry

$$ J_{[\lambda\mu\nu]} = 0 . \tag{3.6} $$

Thus $J_{\lambda\mu\nu}$ has 20 independent components, in agreement with the above assertion that there are 20 independent contracted Bianchi identities. Since the Weyl tensor is traceless, contracting the Weyl evolution equations (3.4) on $\lambda\nu$ yields zero on the left hand side, so that the contracted Weyl current satisfies

$$ J^\lambda{}_{\lambda\mu} = 0 . \tag{3.7} $$

This doubly-contracted Bianchi identity, which is the same as equation (2.89), enforces conservation of energy-momentum. Unlike the cyclic symmetries (3.6), which are automatically satisfied, equations (3.7) constitute a non-trivial set of 4 conditions on the Einstein tensor. Besides the algebraic relations (3.6) and (3.7), the Weyl current satisfies 6 differential identities comprising the conservation law

$$ D^\lambda J_{\lambda\mu\nu} = 0 \tag{3.8} $$

in view of equation (3.4) and the antisymmetry of $C_{\kappa\lambda\mu\nu}$ with respect to the indices $\kappa\lambda$. The Weyl current conservation law (3.8) follows automatically from the form (3.5) of the Weyl current, coupled with energy-momentum conservation (2.89), so does not impose any additional non-trivial conditions on the Riemann tensor.

The 4 relations (3.7) and the 6 identities (3.8) account for 10 of the 20 contracted Bianchi identities (3.3). The remaining 10 equations comprise Maxwell-like equations (3.4) for the evolution of the 10 components of the Weyl tensor.

Whereas the Einstein equations relating the Einstein tensor to the energy-momentum tensor are postulated equations of general relativity, the 10 evolution equations for the Weyl tensor, and the 4 equations enforcing covariant conservation of the Einstein tensor, follow mathematically from the Bianchi identities, and do not represent additional assumptions of the theory.

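Exercise 3.1 Confirm the counting of degrees of freedom.

⋄

Exercise 3.2 From the Bianchi identities, show that the Riemann tensor satisfies the covariant wave equation

$$ \Box R_{\kappa\lambda\mu\nu} = D_\kappa D_\mu R_{\lambda\nu} - D_\kappa D_\nu R_{\lambda\mu} + D_\lambda D_\nu R_{\kappa\mu} - D_\lambda D_\mu R_{\kappa\nu} , \tag{3.9} $$

where $\Box$ is the D'Alembertian operator, the 4-dimensional wave operator

$$ \Box \equiv D^\pi D_\pi . \tag{3.10} $$

Show that contracting equation (3.9) with $g^{\lambda\nu}$ yields the identity $\Box R_{\kappa\mu} = \Box R_{\kappa\mu}$. Conclude that the wave equation (3.9) is non-trivial only for the trace-free part of the Riemann tensor, the Weyl tensor $C_{\kappa\lambda\mu\nu}$. Show that the wave equation for the Weyl tensor is

$$ \begin{aligned} \Box C_{\kappa\lambda\mu\nu} = {} & ( D_\kappa D_\mu - \tfrac{1}{2} g_{\kappa\mu} \Box ) R_{\lambda\nu} - ( D_\kappa D_\nu - \tfrac{1}{2} g_{\kappa\nu} \Box ) R_{\lambda\mu} \\ & + ( D_\lambda D_\nu - \tfrac{1}{2} g_{\lambda\nu} \Box ) R_{\kappa\mu} - ( D_\lambda D_\mu - \tfrac{1}{2} g_{\lambda\mu} \Box ) R_{\kappa\nu} \\ & + \tfrac{1}{6} ( g_{\kappa\mu} g_{\lambda\nu} - g_{\kappa\nu} g_{\lambda\mu} ) \Box R . \end{aligned} \tag{3.11} $$

Conclude that in a vacuum, where $R_{\kappa\mu} = 0$,

$$ \Box C_{\kappa\lambda\mu\nu} = 0 . \tag{3.12} $$

⋄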

3.3 Geodesic deviation

This section on geodesic deviation is included not because the equation of geodesic deviation is crucial to everyday calculations in general relativity, but rather for two reasons. First, the equation offers insight into the physical meaning of the Riemann tensor. Second, the derivation of the equation offers a fine illustration of the fact that in general relativity, whenever you take differences at infinitesimally separated points in space or time, you should always take covariant differences.

Consider two objects that are free-falling along two infinitesimally separated geodesics. In flat space the acceleration between the two objects would be zero, but in curved space the curvature induces a finite acceleration between the two objects. This is how an observer can measure curvature, at least in principle: set up an ensemble of objects initially at rest a small distance away from the observer in the observer's locally inertial frame, and watch how the objects begin to move. The equation (3.18) that describes this acceleration between objects an infinitesimal distance apart is called the equation of geodesic deviation.

The covariant difference in the velocities of two objects an infinitesimal distance $\delta x^\mu$ apart is

$$ \frac{D \delta x^\mu}{D\tau} = \delta u^\mu . \tag{3.13} $$

In general relativity, the ordinary difference between vectors at two points a small interval apart is not a physically meaningful thing, because the frames of reference at the two points are different. The only physically meaningful difference is the covariant difference, which is the difference in the two vectors parallel-transported across the gap between them. It is only this covariant difference that is independent of the frame of reference. On the left hand side of equation (3.13), the proper time derivative must be the covariant proper time derivative, $D/D\tau = u^\lambda D_\lambda$. On the right hand side of equation (3.13), the difference in the 4-velocity at two points $\delta x^\kappa$ apart must be the covariant difference $\delta = \delta x^\kappa D_\kappa$. Thus equation (3.13) means explicitly the covariant equation

$$ u^\lambda D_\lambda \delta x^\mu = \delta x^\kappa D_\kappa u^\mu . \tag{3.14} $$

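To derive the equation of geodesic deviation, first vary the geodesic equation $D u_\mu / D\tau = 0$ (I've put the index $\mu$ downstairs so that the final equation (3.18) looks cosmetically better, but of course since everything is covariant the $\mu$ index could just as well be put upstairs everywhere):

$$ \begin{aligned} 0 &= \delta \left( \frac{D u_\mu}{D\tau} \right) \\ &= \delta x^\kappa D_\kappa \left( u^\lambda D_\lambda u_\mu \right) \\ &= \delta u^\lambda D_\lambda u_\mu + \delta x^\kappa u^\lambda D_\kappa D_\lambda u_\mu . \end{aligned} \tag{3.15} $$

On the second line, the covariant difference $\delta$ between quantities a small distance $\delta x^\kappa$ apart has been set equal to $\delta x^\kappa D_\kappa$, while $D/D\tau$ has been set equal to the covariant time derivative $u^\lambda D_\lambda$ along the geodesic. On the last line, $\delta x^\kappa D_\kappa u^\lambda$ has been replaced by $\delta u^\lambda$. Next, consider the covariant acceleration of the interval $\delta x_\mu$, which is the covariant proper time derivative of the covariant velocity difference $\delta u_\mu$:

$$ \begin{aligned} \frac{D^2 \delta x_\mu}{D\tau^2} &= \frac{D \delta u_\mu}{D\tau} \\ &= u^\lambda D_\lambda \left( \delta x^\kappa D_\kappa u_\mu \right) \\ &= \delta u^\kappa D_\kappa u_\mu + \delta x^\kappa u^\lambda D_\lambda D_\kappa u_\mu . \end{aligned} \tag{3.16} $$

As in the previous equation (3.15), on the second line $D/D\tau$ has been set equal to $u^\lambda D_\lambda$, while $\delta$ has been set equal to $\delta x^\kappa D_\kappa$. On the last line, $u^\lambda D_\lambda \delta x^\kappa$ has been set equal to $\delta u^\kappa$, equation (3.14). Subtracting (3.15) from (3.16) gives

$$ \frac{D^2 \delta x_\mu}{D\tau^2} = \delta x^\kappa u^\lambda [D_\lambda , D_\kappa] u_\mu , \tag{3.17} $$

or equivalently

$$ \frac{D^2 \delta x_\mu}{D\tau^2} + R_{\kappa\lambda\mu\nu} \, \delta x^\kappa u^\lambda u^\nu = 0 , \tag{3.18} $$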

which is the desired equation of geodesic deviation.
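As a hedged illustration of what geodesic deviation looks like in practice, the following sketch uses the simplest curved space available, a unit 2-sphere, rather than spacetime. That choice, and all function names, are assumptions made for illustration. Two great circles leave the north pole at slightly different azimuths. Their separation $\xi(s)$ as a function of arc length $s$ should obey the two-dimensional analogue of equation (3.18), $d^2\xi/ds^2 = -K\xi$, with Gaussian curvature $K = 1$.

```python
import math

def great_circle_point(phi, s):
    """Point at arc length s along the unit-sphere great circle
    that leaves the north pole at azimuth phi."""
    return (math.sin(s) * math.cos(phi),
            math.sin(s) * math.sin(phi),
            math.cos(s))

def separation(s, dphi=1e-4):
    """Distance between two neighbouring geodesics at arc length s."""
    a = great_circle_point(0.0, s)
    b = great_circle_point(dphi, s)
    return math.dist(a, b)

def relative_acceleration(s, h=1e-3):
    """Central-difference estimate of the second derivative of the separation."""
    return (separation(s + h) - 2.0 * separation(s) + separation(s - h)) / h**2

def deviation_ratio(s):
    """(d^2 xi / ds^2) / xi, expected to be close to -K = -1 on the unit sphere."""
    return relative_acceleration(s) / separation(s)

if __name__ == "__main__":
    for s in (0.3, 0.7, 1.2):
        print(s, deviation_ratio(s))
```

The ratio is independent of the initial separation `dphi`, which mirrors the fact that the deviation equation is linear in $\delta x^\kappa$.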

3.4 Commutator of the covariant derivative revisited The commutator of the covariant derivative is of fundamental importance because it defines what is meant by the field in gauge theories. It was seen above that the commutator of the covariant derivative acting on a scalar defined the torsion tensor, equation (2.50), which general relativity assumes vanishes, while the commutator of the covariant derivative acting on a vector defined the Riemann tensor, equation (2.74). Does the commutator of the covariant derivative acting on a general tensor introduce any other distinct tensor? No:


the torsion and Riemann tensors completely define the action of the commutator of the covariant derivative on any tensor, equation (3.22).

The general expression (3.22) for the commutator of the covariant derivative reveals the meaning of the torsion and Riemann tensors. The torsion and Riemann tensors describe respectively the displacement and the Lorentz transformation experienced by an object when parallel-transported around a curve. Displacement and Lorentz transformations together constitute the Poincaré group, the complete group of symmetries of flat spacetime.

How can an object detect a displacement when parallel-transported around a curve? If you go around a curve back to the same coordinate in spacetime where you began, won't you necessarily be at the same position? This is a question that goes to the heart of the meaning of spacetime. To answer the question, you have to consider how fundamental particles are able to detect position, orientation, and velocity. Classically, particles may be structureless points, but quantum mechanically, particles possess frequency, wavelength, spin, and (in the relativistic theory) boost, and presumably it is these properties that allow particles to "measure" the properties of the spacetime in which they live. Specifically, a Dirac spinor (relativistic spin-$\frac{1}{2}$ particle) has 8 degrees of freedom, of which 6 define a Lorentz transformation (a Lorentz rotor, a member of the group of spin-$\frac{1}{2}$ Lorentz transformations), and the remaining 2 comprise a complex number $\sim e^{-i p \cdot x}$ whose phase encodes the displacement [draft query: really? The complex number is with respect to the pseudoscalar $I$]. Thus a Dirac spinor could potentially detect a displacement through a change in its phase when parallel-transported around a curve back to the same point in spacetime. General relativity, which assumes that torsion vanishes, asserts that there is no such change of phase.
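In the presence of torsion, the expression for the connection coefficients $\Gamma_{\lambda\mu\nu}$ is, from equation (2.57),

$$ \Gamma_{\lambda\mu\nu} = \frac{1}{2} \left( \frac{\partial g_{\lambda\mu}}{\partial x^\nu} + \frac{\partial g_{\lambda\nu}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} + S_{\lambda\mu\nu} + S_{\mu\nu\lambda} + S_{\nu\mu\lambda} \right) . \tag{3.19} $$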

The first part $\frac{1}{2} (g_{\lambda\mu,\nu} + g_{\lambda\nu,\mu} - g_{\mu\nu,\lambda})$ of this expression is called the Christoffel symbol of the first kind [the same thing with the first index raised, $\frac{1}{2} g^{\kappa\lambda} (g_{\lambda\mu,\nu} + g_{\lambda\nu,\mu} - g_{\mu\nu,\lambda})$, is called the Christoffel symbol of the second kind], while the second part $\frac{1}{2} (S_{\lambda\mu\nu} + S_{\mu\nu\lambda} + S_{\nu\mu\lambda})$ is called the contortion (not contorsion!) tensor. There's no need to remember the crazy jargon, but in case you meet it, that's what it means.

If torsion does not vanish, then the commutator of the covariant derivative acting on a contravariant 4-vector is

$$ [D_\kappa , D_\lambda] A_\mu = S^\nu_{\kappa\lambda} D_\nu A_\mu + R_{\kappa\lambda\mu\nu} A^\nu \quad \text{is a coordinate tensor} \tag{3.20} $$

where the Riemann tensor $R_{\kappa\lambda\mu\nu}$ is given in terms of the connection coefficients by the same formula (2.75) as before, but the connection coefficients $\Gamma_{\lambda\mu\nu}$ themselves are given by (3.19). The Riemann tensor is still antisymmetric in each of $\kappa \leftrightarrow \lambda$ and $\mu \leftrightarrow \nu$, but with torsion it is no longer symmetric in $\kappa\lambda \leftrightarrow \mu\nu$. In other words, the symmetries of the Riemann tensor with torsion are

$$ R_{[\kappa\lambda][\mu\nu]} . \tag{3.21} $$

As a matrix of antisymmetric tensors, the Riemann tensor with torsion has 6 × 6 = 36 degrees of freedom.
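The counting behind the figure of 36 can be sketched in a few lines. The function names are illustrative, and the torsion-free count of 20, which is the standard result in four dimensions, is included only for comparison and is not derived in this section: without torsion the $6 \times 6$ matrix becomes symmetric, and the totally antisymmetric part is removed by the cyclic identity.

```python
def antisymmetric_pairs(n=4):
    """Independent components of one antisymmetric index pair in n dimensions."""
    return n * (n - 1) // 2

def riemann_components_with_torsion(n=4):
    """General matrix of antisymmetric pairs, symmetries R_[kl][mn] only."""
    p = antisymmetric_pairs(n)
    return p * p

def riemann_components_without_torsion(n=4):
    """Symmetric matrix of antisymmetric pairs, minus the cyclic-identity constraints."""
    p = antisymmetric_pairs(n)
    symmetric_matrix = p * (p + 1) // 2
    cyclic_constraints = n * (n - 1) * (n - 2) * (n - 3) // 24
    return symmetric_matrix - cyclic_constraints

if __name__ == "__main__":
    print(antisymmetric_pairs(), riemann_components_with_torsion(),
          riemann_components_without_torsion())
```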


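Because the Riemann tensor $R_{\kappa\lambda\mu\nu}$ is no longer symmetric in $\kappa\lambda \leftrightarrow \mu\nu$, the Ricci tensor $R_{\kappa\mu} \equiv g^{\lambda\nu} R_{\kappa\lambda\mu\nu}$ is no longer symmetric, and likewise the Einstein tensor $G_{\kappa\mu} \equiv R_{\kappa\mu} - \frac{1}{2} R \, g_{\kappa\mu}$ is no longer symmetric. Evidently the antisymmetric part of the Einstein tensor depends on torsion.

Acting on a general tensor, the commutator of the covariant derivative is

$$ [D_\kappa , D_\lambda] A^{\pi\rho\ldots}_{\mu\nu\ldots} = S^\alpha_{\kappa\lambda} D_\alpha A^{\pi\rho\ldots}_{\mu\nu\ldots} + R_{\kappa\lambda\mu}{}^{\alpha} A^{\pi\rho\ldots}_{\alpha\nu\ldots} + R_{\kappa\lambda\nu}{}^{\alpha} A^{\pi\rho\ldots}_{\mu\alpha\ldots} - R_{\kappa\lambda\alpha}{}^{\pi} A^{\alpha\rho\ldots}_{\mu\nu\ldots} - R_{\kappa\lambda\alpha}{}^{\rho} A^{\pi\alpha\ldots}_{\mu\nu\ldots} . \tag{3.22} $$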

In more abstract notation, the commutator of the covariant derivative is the operator

$$ [D_\kappa , D_\lambda] = \boldsymbol{S}_{\kappa\lambda} \cdot \boldsymbol{D} + R_{\kappa\lambda} \tag{3.23} $$

where $\boldsymbol{S}_{\kappa\lambda} \equiv \boldsymbol{g}_\mu S^\mu_{\kappa\lambda}$ and $\boldsymbol{D} \equiv \boldsymbol{g}^\mu D_\mu$, and the Riemann curvature operator $R_{\kappa\lambda}$ is an operator whose action on any tensor is specified by equation (3.22). The action of the operator $R_{\kappa\lambda}$ is analogous to that of the covariant derivative (2.40): there's a positive $R$ term for each covariant index, and a negative $R$ term for each contravariant index. The action of $R_{\kappa\lambda}$ on a scalar is zero, which reflects the fact that a scalar is unchanged by a Lorentz transformation.

4 ∗

Action principle

Hamilton’s principle of least action postulates that any dynamical system is characterized by a scalar action S, which has the property that when the system evolves from one specified state to another, the path by which it gets between the two states is such as to minimize the action. The action need not be a global minimum, just a local minimum with respect to small variations in the path between fixed initial and final states. That nature appears to respect the principle of least action is of the profoundest significance.

Figure 4.1 The action principle considers various paths through spacetime between fixed initial and final conditions, and chooses that path that minimizes the action. (The figure labels the endpoints x0 and x1 and the path parameter λ.)


4.1 Principle of least action for point particles

The path of a point particle through spacetime is specified by its coordinates $x^\mu(\lambda)$ as a function of some arbitrary parameter $\lambda$. In non-relativistic mechanics it is usual to take the parameter $\lambda$ to be the time $t$, and the path of a particle through space is then specified by three spatial coordinates $x^i(t)$. In relativity however it is more natural to treat the time and space coordinates on an equal footing, and regard the path of a particle as being specified by four spacetime coordinates $x^\mu(\lambda)$ as a function of an arbitrary parameter $\lambda$. The parameter $\lambda$ is simply a continuous parameter that labels points along the path, and has no physical significance (for example, it is not necessarily an affine parameter). The path of a system of $N$ point particles through spacetime is specified by $4N$ coordinates $x^\mu(\lambda)$.

The action principle postulates that, for a system of $N$ point particles, the action $S$ is an integral of a Lagrangian $L(x^\mu , dx^\mu/d\lambda)$ which is a function of the $4N$ coordinates $x^\mu(\lambda)$ together with the $4N$ velocities $dx^\mu/d\lambda$ with respect to the arbitrary parameter $\lambda$. The action from an initial state at $\lambda_i$ to a final state at $\lambda_f$ is thus

$$ S = \int_{\lambda_i}^{\lambda_f} L\left( x^\mu , \frac{dx^\mu}{d\lambda} \right) d\lambda . \tag{4.1} $$

The principle of least action demands that the actual path taken by the system between given initial and final coordinates $x^\mu_i$ and $x^\mu_f$ is such as to minimize the action. Thus the variation $\delta S$ of the action must be zero under any change $\delta x^\mu$ in the path, subject to the constraint that the coordinates at the endpoints are fixed, $\delta x^\mu_i = 0$ and $\delta x^\mu_f = 0$,

$$ \delta S = \int_{\lambda_i}^{\lambda_f} \left( \frac{\partial L}{\partial x^\mu} \, \delta x^\mu + \frac{\partial L}{\partial (dx^\mu/d\lambda)} \, \delta (dx^\mu/d\lambda) \right) d\lambda = 0 . \tag{4.2} $$

The change in the velocity along the path is just the velocity of the change, $\delta(dx^\mu/d\lambda) = d(\delta x^\mu)/d\lambda$. Integrating the second term in the integrand of equation (4.2) by parts yields

$$ \delta S = \left[ \frac{\partial L}{\partial (dx^\mu/d\lambda)} \, \delta x^\mu \right]_{\lambda_i}^{\lambda_f} + \int_{\lambda_i}^{\lambda_f} \left( \frac{\partial L}{\partial x^\mu} - \frac{d}{d\lambda} \frac{\partial L}{\partial (dx^\mu/d\lambda)} \right) \delta x^\mu \, d\lambda = 0 . \tag{4.3} $$

The surface term in equation (4.3) vanishes, since by hypothesis the coordinates are held fixed at the end points, so $\delta x^\mu = 0$ at the end points. Therefore the integral in equation (4.3) must vanish. Indeed least action requires the integral to vanish for all possible variations $\delta x^\mu$ in the path. The only way this can happen is that the integrand must be identically zero. The result is the Euler-Lagrange equations of motion

$$ \frac{d}{d\lambda} \frac{\partial L}{\partial (dx^\mu/d\lambda)} - \frac{\partial L}{\partial x^\mu} = 0 . \tag{4.4} $$
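The statement that the action is stationary along the true path can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a non-relativistic unit-mass harmonic oscillator, $L = \frac{1}{2}\dot{x}^2 - \frac{1}{2}x^2$, with the time $t$ as the parameter, none of which appears in the text above. The true path $x = \sin t$ is perturbed by a bump that vanishes at both endpoints, and the first-order change in the discretized action should be close to zero while the second-order change is positive.

```python
import math

def action(path, dt):
    """Discretized action for L = v^2/2 - x^2/2 (an illustrative Lagrangian)."""
    total = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        v = (x1 - x0) / dt          # velocity on the segment
        xm = 0.5 * (x0 + x1)        # midpoint position
        total += (0.5 * v * v - 0.5 * xm * xm) * dt
    return total

def perturbed_action(eps, n=2000, t_end=1.0):
    """Action of the path sin(t) + eps * bump(t), with fixed endpoints."""
    dt = t_end / n
    path = []
    for k in range(n + 1):
        t = k * dt
        bump = math.sin(math.pi * t / t_end)   # vanishes at t = 0 and t = t_end
        path.append(math.sin(t) + eps * bump)
    return action(path, dt)

def first_variation(eps=1e-3):
    """Central-difference estimate of dS/d(eps) at eps = 0."""
    return (perturbed_action(eps) - perturbed_action(-eps)) / (2.0 * eps)

def second_variation(eps=1e-3):
    """Central-difference estimate of d^2 S/d(eps)^2 at eps = 0."""
    return (perturbed_action(eps) - 2.0 * perturbed_action(0.0)
            + perturbed_action(-eps)) / eps**2

if __name__ == "__main__":
    print(first_variation(), second_variation())
```

A positive second variation indicates that, for this short time interval, the true path is a local minimum of the action and not merely a stationary point.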

It might seem that the Euler-Lagrange equations (4.4) are inadequately specified, since they depend on some arbitrary unknown parameter λ. But in fact the Euler-Lagrange equations are the same regardless of the choice of λ. An example of the irrelevance of λ will be seen in the next section, §4.2. Since λ can be chosen arbitrarily, it is usual to choose it in some convenient fashion. For a massive particle, λ can be taken


equal to the proper time τ of the particle. For a massless particle, whose proper time never progresses, λ can be taken equal to an affine parameter.

4.2 Action for a test particle

According to the principle of equivalence, a test particle in a gravitating system moves along a geodesic, a straight line relative to local free-falling frames. A geodesic is the shortest distance between two points. In relativity this translates, for a massive particle, into the longest proper time between two points. The proper time along any path is $d\tau = \sqrt{-ds^2} = \sqrt{-g_{\mu\nu} \, dx^\mu dx^\nu}$. Thus the action $S_m$ of a test particle of rest mass $m$ in a gravitating system is

$$ S_m = -m \int_{\lambda_i}^{\lambda_f} d\tau = -m \int_{\lambda_i}^{\lambda_f} \sqrt{ -g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} } \; d\lambda . \tag{4.5} $$

The factor of rest mass $m$ brings the action, which has units of angular momentum, to standard normalization. The overall minus sign comes from the fact that the action is a minimum whereas the proper time is a maximum along the path. The action principle requires that the Lagrangian be written as a function of the coordinates $x^\mu$ and velocities $dx^\nu/d\lambda$, and it is seen that the integrand in the last expression of equation (4.5) has the desired form, the metric $g_{\mu\nu}$ being considered a given function of the coordinates. Thus the Lagrangian $L_m$ of a test particle of mass $m$ is

$$ L_m = -m \sqrt{ -g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} } . \tag{4.6} $$

The partial derivatives that go in the Euler-Lagrange equations (4.4) are then

$$ \frac{\partial L_m}{\partial (dx^\kappa/d\lambda)} = -m \, \frac{ -g_{\kappa\nu} \, \dfrac{dx^\nu}{d\lambda} }{ \sqrt{ -g_{\pi\rho} (dx^\pi/d\lambda)(dx^\rho/d\lambda) } } , \tag{4.7a} $$

$$ \frac{\partial L_m}{\partial x^\kappa} = -m \, \frac{ -\dfrac{1}{2} \dfrac{\partial g_{\mu\nu}}{\partial x^\kappa} \dfrac{dx^\mu}{d\lambda} \dfrac{dx^\nu}{d\lambda} }{ \sqrt{ -g_{\pi\rho} (dx^\pi/d\lambda)(dx^\rho/d\lambda) } } . \tag{4.7b} $$

The denominators in the expressions (4.7) for the partial derivatives of the Lagrangian are $\sqrt{ -g_{\pi\rho} (dx^\pi/d\lambda)(dx^\rho/d\lambda) } = d\tau/d\lambda$. It was not legitimate to make this substitution before taking the partial derivatives, since the Euler-Lagrange equations require that the Lagrangian be expressed in terms of $x^\mu$ and $dx^\mu/d\lambda$, but it is fine to make the substitution now that the partial derivatives have been obtained. The partial derivatives (4.7) thus simplify to

$$ \frac{\partial L_m}{\partial (dx^\kappa/d\lambda)} = m \, g_{\kappa\nu} \frac{dx^\nu}{d\lambda} \frac{d\lambda}{d\tau} = m u_\kappa , \tag{4.8} $$

$$ \frac{\partial L_m}{\partial x^\kappa} = \frac{m}{2} \frac{\partial g_{\mu\nu}}{\partial x^\kappa} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} \frac{d\lambda}{d\tau} = m \, \Gamma_{\mu\nu\kappa} u^\mu u^\nu \frac{d\tau}{d\lambda} , \tag{4.9} $$

in which $u^\kappa \equiv dx^\kappa/d\tau$ is the usual 4-velocity, and the derivative of the metric has been replaced by connections in accordance with equation (2.54). The resulting Euler-Lagrange equations of motion (4.4) are

$$ \frac{d \, m u_\kappa}{d\lambda} = m \, \Gamma_{\mu\nu\kappa} u^\mu u^\nu \frac{d\tau}{d\lambda} . \tag{4.10} $$

As remarked in §4.1, the choice of the arbitrary parameter $\lambda$ has no effect on the equations of motion. With a factor of $m \, d\tau/d\lambda$ cancelled, and with the derivative converted to a covariant derivative by (2.39), equation (4.10) becomes

$$ \frac{D u_\kappa}{D\tau} = S_{\mu\nu\kappa} u^\mu u^\nu , \tag{4.11} $$

where $S_{\mu\nu\kappa}$ is the torsion, equation (2.51). If torsion vanishes, as general relativity assumes, then the result is the usual equation of geodesic motion

$$ \frac{D u_\kappa}{D\tau} = 0 . \tag{4.12} $$

The fact that motion is geodesic only if torsion vanishes is to be expected, since, as argued in §2.6.7, space is locally inertial only if torsion vanishes.

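4.3 Action for a charged test particle in an electromagnetic field

The aim is to reproduce the Lorentz force law. The action is

$$ S = S_m + S_q \tag{4.13} $$

where

$$ S_q = q \int_{\lambda_i}^{\lambda_f} A_\mu \, dx^\mu = q \int_{\lambda_i}^{\lambda_f} A_\mu \frac{dx^\mu}{d\lambda} \, d\lambda . \tag{4.14} $$

The Lagrangian is therefore

$$ L_q = q A_\mu \frac{dx^\mu}{d\lambda} . \tag{4.15} $$

Partial derivatives are

$$ \frac{\partial L_q}{\partial (dx^\kappa/d\lambda)} = q A_\kappa , \qquad \frac{\partial L_q}{\partial x^\kappa} = q \frac{\partial A_\mu}{\partial x^\kappa} \frac{dx^\mu}{d\lambda} = q \frac{\partial A_\mu}{\partial x^\kappa} u^\mu \frac{d\tau}{d\lambda} . \tag{4.16a} $$

Applied to the Lagrangian $L = L_m + L_q$, the Euler-Lagrange equations (4.4) are

$$ \frac{d}{d\lambda} \left( m u_\kappa + q A_\kappa \right) = \left( m \, \Gamma_{\mu\nu\kappa} u^\mu u^\nu + q \frac{\partial A_\mu}{\partial x^\kappa} u^\mu \right) \frac{d\tau}{d\lambda} . \tag{4.17} $$

If torsion vanishes, as general relativity assumes, then the result is the Lorentz force law for a test particle of mass $m$ and charge $q$ moving in a prescribed gravitational and electromagnetic field

$$ \frac{D \, m u_\kappa}{D\tau} = q F_{\mu\kappa} u^\mu . \tag{4.18} $$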

Exercise 4.1 Show that if torsion does not vanish, then the Lorentz force law becomes

$$ \frac{D \, m u_\kappa}{D\tau} = \left[ q F_{\mu\kappa} + S_{\nu\mu\kappa} \left( m u^\nu + q A^\nu \right) \right] u^\mu . \tag{4.19} $$

[Hint: Recall the relations (2.51) and (2.52).] ⋄

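4.4 Generalized momentum

The generalized momentum is defined by

$$ \pi_\kappa \equiv \frac{\partial L}{\partial (dx^\kappa/d\lambda)} . \tag{4.20} $$

The generalized momentum $\pi_\kappa$ of a test particle coincides with the ordinary momentum $p_\kappa$:

$$ \pi_\kappa = p_\kappa = m u_\kappa . \tag{4.21} $$

The generalized momentum of a test particle of charge $q$ in an electromagnetic field of potential $A_\kappa$ is

$$ \pi_\kappa = p_\kappa + q A_\kappa . \tag{4.22} $$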

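4.5 Hamiltonian

Work with coordinates and generalized momenta instead of coordinates and velocities. Define the Hamiltonian $H$ by

$$ H \equiv \pi_\mu \frac{dx^\mu}{d\lambda} - L . \tag{4.23} $$

The action is then

$$ S = \int \left( \pi_\mu \frac{dx^\mu}{d\lambda} - H \right) d\lambda , \tag{4.24} $$

and its variation with respect to the coordinates and momenta is

$$ \delta S = \left[ \pi_\mu \, \delta x^\mu \right] + \int \left[ - \left( \frac{d\pi_\mu}{d\lambda} + \frac{\partial H}{\partial x^\mu} \right) \delta x^\mu + \left( \frac{dx^\mu}{d\lambda} - \frac{\partial H}{\partial \pi_\mu} \right) \delta \pi_\mu \right] d\lambda . \tag{4.25} $$

Requiring the integral to vanish for arbitrary $\delta x^\mu$ and $\delta \pi_\mu$ gives Hamilton's equations

$$ \frac{d\pi_\mu}{d\lambda} = - \frac{\partial H}{\partial x^\mu} , \qquad \frac{dx^\mu}{d\lambda} = \frac{\partial H}{\partial \pi_\mu} . \tag{4.26} $$

What is this useful for?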
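One feature worth seeing concretely is what the definition (4.23) gives when applied to the test-particle Lagrangian (4.6). The sketch below is an assumption-laden illustration: it takes flat (Minkowski) spacetime, estimates the generalized momentum (4.20) by finite differences, and evaluates $H$. Because the Lagrangian (4.6) is homogeneous of degree one in the velocities $dx^\mu/d\lambda$, Euler's theorem implies $\pi_\mu \, dx^\mu/d\lambda = L$, so the Hamiltonian defined this way should vanish for any timelike velocity. The helper names are hypothetical.

```python
import math

# Minkowski metric diag(-1, 1, 1, 1): an illustrative special case.
ETA = (-1.0, 1.0, 1.0, 1.0)

def lagrangian(v, m=1.0):
    """Test-particle Lagrangian (4.6) for velocities v = dx/dlambda in flat spacetime."""
    return -m * math.sqrt(-sum(g * a * a for g, a in zip(ETA, v)))

def generalized_momentum(v, m=1.0, h=1e-6):
    """pi_kappa = dL/d(v^kappa), estimated by central differences."""
    pi = []
    for k in range(4):
        up = list(v)
        dn = list(v)
        up[k] += h
        dn[k] -= h
        pi.append((lagrangian(up, m) - lagrangian(dn, m)) / (2.0 * h))
    return pi

def hamiltonian(v, m=1.0):
    """H = pi_mu v^mu - L, equation (4.23)."""
    pi = generalized_momentum(v, m)
    return sum(p * a for p, a in zip(pi, v)) - lagrangian(v, m)

# An arbitrary timelike velocity with respect to an arbitrary parameter lambda.
v_example = [2.0, 0.6, -0.4, 0.2]

if __name__ == "__main__":
    print(generalized_momentum(v_example))
    print(hamiltonian(v_example))
```

The computed momentum also agrees with (4.21): each component equals $m \, \eta_{\kappa\nu} v^\nu$ divided by $d\tau/d\lambda$.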


4.6 Derivatives of the action

Besides being a scalar whose minimum value between fixed endpoints defines the path between those points, the action can also be treated as a function of its endpoints along the actual path. Along the actual path, the equations of motion are satisfied, so the integral in the variation (4.3) of the action vanishes identically. The surface term in the variation (4.3) then implies that $\delta S = \pi_\mu \, \delta x^\mu$, so that the partial derivatives of the action with respect to the coordinates are equal to the generalized momenta,

$$ \frac{\partial S}{\partial x^\mu} = \pi_\mu . \tag{4.27} $$

This is the basis of the Hamilton-Jacobi method.

PART THREE IDEAL BLACK HOLES

Concept Questions

1. What evidence do astronomers currently accept as indicating the presence of a black hole in a system?
2. Why can astronomers measure the masses of supermassive black holes only in relatively nearby galaxies?
3. To what extent (with what accuracy) are real black holes in our Universe described by the no-hair theorem?
4. Does the no-hair theorem apply inside a black hole?
5. Black holes lose their hair on a light-crossing time. How long is a light-crossing time for a typical stellar-sized or supermassive astronomical black hole?
6. Relativists say that the metric is $g_{\mu\nu}$, but they also say that the metric is $ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu$. How can both statements be correct?
7. The Schwarzschild geometry is said to describe the geometry of spacetime outside the surface of the Sun or Earth. But the Schwarzschild geometry supposedly describes non-rotating masses, whereas the Sun and Earth are rotating. If the Sun or Earth collapsed to a black hole conserving their mass $M$ and angular momentum $L$, roughly what would the spin $a/M = L/M^2$ of the black hole be relative to the maximal spin $a/M = 1$ of a Kerr black hole?
8. What happens at the horizon of a black hole?
9. As cold matter becomes denser, it goes through the stages of being solid/liquid like a planet, then electron degenerate like a white dwarf, then neutron degenerate like a neutron star, then finally it collapses to a black hole. Why could there not be a denser state of matter, denser than a neutron star, that brings a star to rest inside its horizon?
10. How can an observer determine whether they are "at rest" in the Schwarzschild geometry?
11. An observer outside the horizon of a black hole never sees anything pass through the horizon, even to the end of the Universe. Does the black hole then ever actually collapse, if no one ever sees it do so?
12. If nothing can ever get out of a black hole, how does its gravity get out?
13. Why did Einstein believe that black holes could not exist in nature?
14. In what sense is a rotating black hole "stationary", but not "static"?
15. What is a white hole? Do they exist?
16. Could the expanding Universe be a white hole?
17. Could the Universe be the interior of a black hole?
18. You know the Schwarzschild metric for a black hole. What is the corresponding metric for a white hole?
19. What is the best kind of black hole to fall into if you want to avoid being tidally torn apart?
20. Why do astronomers often assume that the inner edge of an accretion disk around a black hole occurs at the innermost stable orbit?
21. A collapsing star of uniform density has the geometry of a collapsing Friedmann-Robertson-Walker cosmology. If a spatially flat FRW cosmology corresponds to a star that starts from zero velocity at infinity, then to what do open or closed FRW cosmologies correspond?
22. Is the singularity of a Reissner-Nordström black hole gravitationally attractive or repulsive?
23. If you are a charged particle, which dominates near the singularity of the Reissner-Nordström geometry, the electrical attraction/repulsion or the gravitational attraction/repulsion?
24. Is a white hole gravitationally attractive or repulsive?
25. What happens if you fall into a white hole?
26. Which way does time go in Parallel Universes in the Reissner-Nordström geometry?
27. What does it mean that geodesics inside a black hole can have negative energy?
28. Can geodesics have negative energy outside a black hole? How about inside the ergosphere?
29. Physically, what causes mass inflation?
30. Is mass inflation likely to occur inside real astronomical black holes?
31. What happens at the X point, where the ingoing and outgoing inner horizons of the Reissner-Nordström geometry intersect?
32. Can a particle like an electron or proton, whose charge far exceeds its mass (in geometric units), be modeled as a Reissner-Nordström black hole?
33. Does it make sense that a person might be at rest in the Kerr-Newman geometry? How would the Boyer-Lindquist coordinates of such a person vary along their worldline?
34. In identifying $M$ as the mass and $a$ the angular momentum per unit mass of the black hole in the Boyer-Lindquist metric, why is it sufficient to consider the behaviour of the metric at $r \to \infty$?
35. Does space move faster than light inside the ergosphere?
36. If space moves faster than light inside the ergosphere, why is the outer boundary of the ergosphere not a horizon?
37. Do closed timelike curves make sense?
38. What does Carter's fourth integral of motion $Q$ signify physically?
39. What is special about a principal null congruence?
40. Evaluated in the locally inertial frame of a principal null congruence, the spin-0 component of the Weyl scalar of the Kerr geometry is $C = -M/(r - ia\cos\theta)^3$, which looks like the Weyl scalar $C = -M/r^3$ of the Schwarzschild geometry but with radius $r$ replaced by the complex radius $r - ia\cos\theta$. Is there something deep here? Can the Kerr geometry be constructed from the Schwarzschild geometry by complexifying the radial coordinate $r$?

What’s important?

1. Astronomical evidence suggests that stellar-sized and supermassive black holes exist ubiquitously in nature.
2. The no-hair theorem, and when and why it applies.
3. The physical picture of black holes as regions of spacetime where space is falling faster than light.
4. A physical understanding of how the metric of a black hole relates to its physical properties.
5. Penrose (conformal) diagrams. In particular, the Penrose diagrams of the various kinds of vacuum black hole: Schwarzschild, Reissner-Nordström, Kerr-Newman.
6. What really happens inside black holes. Collapse of a star. Mass inflation instability.

5 Observational Evidence for Black Holes

It is beyond the scope of this course to discuss the observational evidence for black holes in any detail. However, it is useful to summarize a few facts.

1. Observational evidence supports the idea that black holes occur ubiquitously in nature. They are not observed directly, but reveal themselves through their effects on their surroundings. Two kinds of black hole are observed: stellar-sized black holes in x-ray binary systems, mostly in our own Milky Way galaxy, and supermassive black holes in Active Galactic Nuclei (AGN) found at the centers of our own and other galaxies.
2. The primary evidence that astronomers accept as indicating the presence of a black hole is a lot of mass compacted into a tiny space.
   a. In an x-ray binary system, if the mass of the compact object exceeds $3\,M_\odot$, the maximum theoretical mass of a neutron star, then the object is considered to be a black hole. Many hundreds of x-ray binary systems are known in our Milky Way galaxy, but only tens of these have measured masses, and in about 20 the measured mass indicates a black hole.
   b. Several tens of thousands of AGN have been cataloged, identified either in the radio, optical, or x-rays. But only in nearby galaxies can the mass of a supermassive black hole be measured directly. This is because it is only in nearby galaxies that the velocities of gas or stars can be measured sufficiently close to the nuclear center to distinguish a regime where the velocity becomes constant, so that the mass can be attributed to an unresolved central point as opposed to a continuous distribution of stars. The masses of about 40 supermassive black holes have been measured in this way. The masses range from the $4 \times 10^6\,M_\odot$ mass of the black hole at the center of the Milky Way to the $3 \times 10^9\,M_\odot$ mass of the black hole at the center of the M87 galaxy at the center of the Virgo cluster at the center of the Local Supercluster of galaxies.
3. Secondary evidences for the presence of a black hole are:
   a. high luminosity;
   b. non-stellar spectrum, extending from radio to gamma-rays;
   c. rapid variability;
   d. relativistic jets.
   Jets in AGN are often one-sided, and a few that are bright enough to be resolved at high angular resolution show superluminal motion. Both evidences indicate that jets are commonly relativistic, moving at close to the speed of light. There are a few cases of jets in x-ray binary systems.
4. Stellar-sized black holes are thought to be created in supernovae as the result of the core-collapse of stars more massive than about $25\,M_\odot$ (this number depends in part on uncertain computer simulations). Supermassive black holes are probably created initially in the same way, but they then grow by accretion of gas funnelled to the center of the galaxy. The growth rates inferred from AGN luminosities are consistent with this picture.
5. Long gamma-ray bursts (lasting more than about 2 seconds) are associated observationally with supernovae. It is thought that in such bursts we are seeing the formation of a black hole. As the black hole gulps down the huge quantity of material needed to make it, it regurgitates a relativistic jet that punches through the envelope of the star. If the jet happens to be pointed in our direction, then we see it relativistically beamed as a gamma-ray burst.
6. Astronomical black holes present the only realistic prospect for testing general relativity in the strong field regime, since such fields cannot be reproduced in the laboratory. At the present time the observational tests of general relativity from astronomical black holes are at best tentative. One test is the redshifting of 7 keV iron lines in a small number of AGN, notably MCG-6-30-15, which can be interpreted as being emitted by matter falling on to a rotating (Kerr) black hole.
7. At present, no gravitational waves have been definitely detected from anything. In the future, gravitational wave astronomy should eventually detect the merger of two black holes. If the waveforms of merging black holes are consistent with the predictions of general relativity, it will provide a far more stringent test of strong field general relativity than has been possible to date.
8. Although gravitational waves have yet to be detected directly, their existence has been inferred from the gradual speeding up of the orbit of the Hulse-Taylor binary, which consists of two neutron stars, one of which, PSR1913+16, is a pulsar. The parameters of the orbit have been measured with exquisite precision, and the rate of orbital speed-up is in good agreement with the energy loss by quadrupole gravitational wave emission predicted by general relativity.

6 Ideal Black Holes

6.1 Definition of a black hole What is a black hole? Doubtless you have heard the standard definition many times: It is a region whose gravity is so strong that not even light can escape. But why can light not escape from a black hole? A standard answer, which John Michell (1784, Phil. Trans. Roy. Soc. London 74, 35) would have found familiar, is that the escape velocity exceeds the speed of light. But that answer brings to mind a Newtonian picture of light going up, turning around, and coming back down, that is altogether different from what general relativity actually predicts. A better definition of a black hole is that it is a region where space is falling faster than light. Inside the horizon, light emitted outwards is carried inward by the faster-than-light inflow of space, like a fish trying but failing to swim up a waterfall. The definition may seem jarring. If space has no substance, how can it fall faster than light? It means that inside the horizon any locally inertial frame is compelled to fall to smaller radius as its proper time goes by. This fundamental fact is true regardless of the choice of coordinates. A similar concept of space moving arises in cosmology. Astronomers observe that the Universe is expanding. Cosmologists find it convenient to conceptualize the expansion by saying that space itself is expanding. For example, the picture that space expands makes it more straightforward, both conceptually and mathematically, to deal with regions of spacetime beyond the horizon, the surface of infinite redshift, of an observer.
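The Newtonian argument attributed above to Michell can be put into numbers. The sketch below is a rough illustration only: it uses the Newtonian escape velocity $v = \sqrt{2GM/r}$ and solves $v = c$ for the radius, with rounded SI values for the constants and masses that are assumptions of the example. The resulting radius, $2GM/c^2$, happens to coincide with the horizon radius $r = 2M$ of the Schwarzschild geometry in geometric units, even though the physical picture is quite different.

```python
import math

# Rounded SI values, adequate for an order-of-magnitude illustration.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # mass of the Sun, kg
M_EARTH = 5.972e24   # mass of the Earth, kg

def escape_velocity(mass, radius):
    """Newtonian escape velocity from distance `radius` of a point mass."""
    return math.sqrt(2.0 * G * mass / radius)

def michell_radius(mass):
    """Radius at which the Newtonian escape velocity equals the speed of light."""
    return 2.0 * G * mass / C**2

if __name__ == "__main__":
    print("Sun:   %.0f m" % michell_radius(M_SUN))
    print("Earth: %.4f m" % michell_radius(M_EARTH))
```

On these numbers the Sun would have to be compressed to a radius of roughly 3 km, and the Earth to just under a centimetre.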

6.2 Ideal black hole The simplest kind of black hole, an ideal black hole, is one that is stationary, electrovac outside its singularity, and extends to asymptotically flat empty space at infinity. Electrovac means that the energy-momentum tensor Tµν is zero except for the contribution from a stationary electromagnetic field.


The next several chapters deal with ideal black holes. The importance of ideal black holes stems from the no-hair theorem, discussed in the next section. The no-hair theorem has the consequence that, except during their initial collapse, or during a merger, real astronomical black holes are accurately described as ideal outside their horizons.

6.3 No-hair theorem I will state and justify the no-hair theorem, but I will not prove it mathematically, since the proof is technical. The no-hair theorem states that a stationary black hole in asymptotically flat space is characterized by just three quantities: 1. Mass M ; 2. Electric charge Q; 3. Spin, usually parameterized by the angular momentum a per unit mass. The mechanism by which a black hole loses its hair is gravitational radiation. When initially formed, whether from the collapse of a massive star or from the merger of two black holes, a black hole will form a complicated, oscillating region of spacetime. But over the course of several light crossing times, the oscillations lose energy by gravitational radiation, and damp out, leaving a stationary black hole. Real astronomical black holes are not isolated, and continue to accrete (cosmic microwave background photons, if nothing else). However, the timescale (a light crossing time) for oscillations to damp out by gravitational radiation is usually far shorter than the timescale for accretion, so in practice real black holes are extremely well described by no-hair solutions almost all of their lives. The physical reason that the no-hair theorem applies is that space is falling faster than light inside the horizon. Consequently, unlike a star, no energy can bubble up from below to replace the energy lost by gravitational radiation, so that the black hole tends to the lowest energy state characterized by conserved quantities. As a corollary, the no-hair theorem does not apply from the inner horizon of a black hole inward, because there space ceases to fall superluminally. If there exist other absolutely conserved quantities, such as magnetic charge (magnetic monopoles), or various supersymmetric charges in theories where supersymmetry is not broken, then the black hole will also be characterized by those quantities. 
Black holes are expected not to conserve quantities such as baryon or lepton number that are thought not to be absolutely conserved, even though they appear to be conserved in low energy physics. Other stationary solutions exist that describe black holes in spacetimes that are not asymptotically flat, such as spacetimes with a cosmological constant, or with a uniform electromagnetic field. It is legitimate to think of the process of reaching a stationary state as analogous to reaching a condition of thermodynamic equilibrium, in which a macroscopic system is described by a small number of parameters associated with the conserved quantities of the system.
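To give a feel for how short the damping timescale is, the sketch below estimates a light-crossing time for a few representative masses. It assumes, for illustration, that the light-crossing time is the horizon radius of a non-rotating black hole divided by the speed of light, $2GM/c^3$; other reasonable definitions differ by factors of a few. The constants are rounded SI values, and the masses are the ones quoted in the chapter on observational evidence plus a nominal 10 solar mass stellar black hole.

```python
# Rounded SI values, adequate for an order-of-magnitude illustration.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # mass of the Sun, kg

def light_crossing_time(mass_in_solar_masses):
    """Horizon radius 2GM/c^2 divided by c, in seconds (an assumed definition)."""
    mass = mass_in_solar_masses * M_SUN
    return 2.0 * G * mass / C**3

if __name__ == "__main__":
    for label, m in (("stellar, 10 Msun", 10.0),
                     ("Milky Way center, 4e6 Msun", 4.0e6),
                     ("M87, 3e9 Msun", 3.0e9)):
        print("%-28s %.3g s" % (label, light_crossing_time(m)))
```

On this estimate the timescale ranges from about a tenth of a millisecond for a stellar-sized black hole to some hours for the most massive ones, in every case far shorter than any plausible accretion timescale.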

7 Schwarzschild Black Hole

The Schwarzschild geometry was discovered by Karl Schwarzschild in late 1915 at essentially the same time that Einstein was arriving at his final version of the General Theory of Relativity.

7.1 Schwarzschild metric

The Schwarzschild metric is, in a polar coordinate system $\{t, r, \theta, \phi\}$, and in geometric units $c = G = 1$,

$$ ds^2 = - \left( 1 - \frac{2M}{r} \right) dt^2 + \left( 1 - \frac{2M}{r} \right)^{-1} dr^2 + r^2 \, do^2 , \tag{7.1} $$

where $do^2$ (this is the Landau & Lifshitz notation) is the metric of a unit 2-sphere

$$ do^2 = d\theta^2 + \sin^2\!\theta \, d\phi^2 . \tag{7.2} $$

The Schwarzschild geometry describes the simplest kind of black hole: a black hole with mass $M$, but no electric charge, and no spin. The geometry describes not only a black hole, but also any empty space surrounding a spherically symmetric mass. Thus the Schwarzschild geometry describes to a good approximation the spacetime outside the surfaces of the Sun and the Earth. Comparison with the spherically symmetric Newtonian metric

$$ ds^2 = - (1 + 2\Phi) \, dt^2 + (1 - 2\Phi) (dr^2 + r^2 \, do^2) \tag{7.3} $$

with Newtonian potential

$$ \Phi(r) = - \frac{M}{r} \tag{7.4} $$

establishes that the $M$ in the Schwarzschild metric is to be interpreted as the mass of the black hole. The Schwarzschild geometry is asymptotically flat, because the metric tends to the Minkowski metric in polar coordinates at large radius

$$ ds^2 \to - dt^2 + dr^2 + r^2 \, do^2 \quad \text{as } r \to \infty . \tag{7.5} $$
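The comparisons made above can be checked numerically. The following sketch, with illustrative function names and geometric units $c = G = 1$, evaluates the diagonal components of the Schwarzschild metric (7.1) and compares them with the Minkowski metric in polar coordinates at large radius, and with the time-time component of the Newtonian metric (7.3) for the potential (7.4).

```python
import math

def schwarzschild_metric(r, theta, M=1.0):
    """Diagonal components (g_tt, g_rr, g_thetatheta, g_phiphi) of metric (7.1)."""
    f = 1.0 - 2.0 * M / r
    return (-f, 1.0 / f, r * r, (r * math.sin(theta)) ** 2)

def minkowski_polar(r, theta):
    """Diagonal components of the Minkowski metric in polar coordinates, as in (7.5)."""
    return (-1.0, 1.0, r * r, (r * math.sin(theta)) ** 2)

def newtonian_g_tt(r, M=1.0):
    """Time-time component -(1 + 2 Phi) of metric (7.3) with Phi = -M/r."""
    phi = -M / r
    return -(1.0 + 2.0 * phi)

if __name__ == "__main__":
    for r in (3.0, 1.0e2, 1.0e6):
        g = schwarzschild_metric(r, math.pi / 3)
        print(r, g[0], g[1], newtonian_g_tt(r))
```

The time-time components of (7.1) and (7.3) agree at every radius, whereas the radial components agree only to first order in $M/r$, which is the discrepancy that Exercise 7.1 addresses.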

Exercise 7.1 The Schwarzschild metric (7.1) does not have the same form as the spherically symmetric Newtonian metric (7.3). By a suitable transformation of the radial coordinate $r$, bring the Schwarzschild metric (7.1) to the isotropic form

$$ ds^2 = - \left( \frac{1 - M/2R}{1 + M/2R} \right)^2 dt^2 + (1 + M/2R)^4 (dR^2 + R^2 \, do^2) . \tag{7.6} $$

What is the relation between $R$ and $r$? Hence conclude that the identification (7.4) is correct, and therefore that $M$ is indeed the mass of the black hole. Is the isotropic form (7.6) of the Schwarzschild metric valid inside the horizon?

7.2 Birkhoff's theorem

Birkhoff's theorem states that the geometry of empty space surrounding a spherically symmetric matter distribution is the Schwarzschild geometry. That is, if the metric is of the form

$$ ds^2 = A(t, r) \, dt^2 + B(t, r) \, dt \, dr + C(t, r) \, dr^2 + D(t, r) \, do^2 , \tag{7.7} $$

where the metric coefficients A, B, C, and D are allowed to be arbitrary functions of t and r, and if the energy momentum tensor vanishes, Tµν = 0, outside some value of the circumferential radius r′ defined by r′2 = D, then the geometry is necessarily Schwarzschild outside that radius. This means that if a mass undergoes spherically symmetric pulsations, then those pulsations do not affect the geometry of the surrounding spacetime. This reflects the fact that there are no spherically symmetric gravitational waves.

7.3 Stationary, static

The Schwarzschild geometry is stationary. A spacetime is said to be stationary if and only if there exists a timelike coordinate t such that the metric is independent of t. In other words, the spacetime possesses time translation symmetry: the metric is unchanged by a time translation $t \to t + t_0$ where $t_0$ is some constant. Evidently the Schwarzschild metric (7.1) is independent of the timelike coordinate t, and is therefore stationary, time translation symmetric.

The Schwarzschild geometry is also static. A spacetime is static if and only if the coordinates can be chosen so that, in addition to being stationary with respect to a time coordinate t, the spatial coordinates do not change along the direction of the tangent vector $g_t$. This requires that the tangent vector $g_t$ be orthogonal to all the spatial tangent vectors

$$ g_t \cdot g_\mu = g_{t\mu} = 0 \quad \text{for } \mu \neq t . \tag{7.8} $$

The Gullstrand-Painlevé metric for the Schwarzschild geometry, discussed in section 7.13, is an example of a metric that is stationary but not static (although the underlying spacetime, being Schwarzschild, is static). The Gullstrand-Painlevé metric is independent of the free-fall time $t_{\rm ff}$, so is stationary, but observers who follow the tangent vector $g_{t_{\rm ff}}$ fall into the black hole, so the metric is not manifestly static. The Schwarzschild time coordinate t is thus identified as a special one: it is the unique time coordinate with respect to which the Schwarzschild geometry is manifestly static.

7.4 Spherically symmetric

The Schwarzschild geometry is also spherically symmetric. This is evident from the fact that the angular part $r^2 do^2$ of the metric is the metric of a 2-sphere of radius r. This can be seen as follows. Consider the metric of ordinary flat 3-dimensional Euclidean space in Cartesian coordinates {x, y, z}:

$$ ds^2 = dx^2 + dy^2 + dz^2 . \tag{7.9} $$

Convert to polar coordinates {r, θ, φ}, defined so that

$$ \begin{aligned} x &= r \sin\theta \cos\phi , \\ y &= r \sin\theta \sin\phi , \\ z &= r \cos\theta . \end{aligned} \tag{7.10} $$

Substituting equations (7.10) into the Euclidean metric (7.9) gives

$$ ds^2 = dr^2 + r^2 (d\theta^2 + \sin^2\theta \, d\phi^2) . \tag{7.11} $$

Restricting to a surface r = constant of constant radius then gives the metric of a 2-sphere of radius r

$$ ds^2 = r^2 (d\theta^2 + \sin^2\theta \, d\phi^2) \tag{7.12} $$

as claimed. The radius r in Schwarzschild coordinates is the circumferential radius, defined such that the proper circumference of the 2-sphere measured by observers at rest in Schwarzschild coordinates is 2πr. This is a coordinate-invariant definition of the meaning of r, which implies that r is a scalar.
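The substitution of (7.10) into (7.9) can be checked numerically. The following is a minimal sketch, not part of the original text: it estimates the partial derivatives of the Cartesian coordinates by finite differences and forms the induced metric, which should come out diagonal with entries 1, r², and r² sin²θ as in (7.11). The function names and the sample point are illustrative assumptions.

```python
import math

def cartesian(r, theta, phi):
    """Map polar coordinates to Cartesian ones, equation (7.10)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def induced_metric(r, theta, phi, h=1e-6):
    """Pull the Euclidean metric (7.9) back to polar coordinates.

    Returns the 3x3 matrix g_ij = sum_k (dx^k/dq^i)(dx^k/dq^j), with the
    partial derivatives estimated by central finite differences.
    """
    q = [r, theta, phi]
    jac = []  # jac[i][k] = d x^k / d q^i
    for i in range(3):
        up, dn = list(q), list(q)
        up[i] += h
        dn[i] -= h
        xp, xm = cartesian(*up), cartesian(*dn)
        jac.append([(a - b) / (2.0 * h) for a, b in zip(xp, xm)])
    return [[sum(jac[i][k] * jac[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Arbitrary sample point (an assumption chosen for illustration).
r, theta, phi = 2.0, 0.7, 1.3
g = induced_metric(r, theta, phi)
expected_diagonal = [1.0, r**2, (r * math.sin(theta))**2]
for row in g:
    print(["%.6f" % entry for entry in row])
```

The off-diagonal entries vanish to within the finite-difference error, which is the statement that the polar coordinate axes are mutually orthogonal.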

7.5 Horizon

The horizon of the Schwarzschild geometry lies at the Schwarzschild radius $r = r_s$

$$ r_s = \frac{2GM}{c^2} , \tag{7.13} $$

where units of c and G have been restored. Where does this come from? The Schwarzschild metric shows that the scalar spacetime distance squared $ds^2$ along an interval at rest in Schwarzschild coordinates, dr = dθ = dφ = 0, is timelike, lightlike, or spacelike depending on whether the radius is greater than, equal to, or less than r = 2M:

$$ ds^2 = -\left(1 - \frac{2M}{r}\right) dt^2 \;\; \begin{cases} < 0 & \text{if } r > 2M , \\ = 0 & \text{if } r = 2M , \\ > 0 & \text{if } r < 2M . \end{cases} \tag{7.14} $$

Since the worldline of a massive observer must be timelike, it follows that a massive observer can remain at rest only outside the horizon, r > 2M. An object at rest at the horizon, r = 2M, follows a null geodesic, which is to say it is a possible worldline of a massless particle, a photon. Inside the horizon, r < 2M, neither massive nor massless objects can remain at rest.

A full treatment of what is going on requires solving the geodesic equation in the Schwarzschild geometry, but the results may be anticipated already at this point. In effect, space is falling into the black hole. Outside the horizon, space is falling at less than the speed of light; at the horizon space is falling at the speed of light; and inside the horizon, space is falling faster than light, carrying everything with it. This is why light cannot escape from a black hole: inside the horizon, space falls inward faster than light, carrying light inward even if that light is pointed radially outward. The statement that space is falling superluminally inside the horizon of a black hole is a coordinate-invariant statement: massive or massless particles are carried inward whatever their state of motion and whatever the coordinate system.

Whereas an interval of coordinate time t switches from timelike outside the horizon to spacelike inside the horizon, an interval of coordinate radius r does the opposite: it switches from spacelike to timelike:

$$ ds^2 = \left(1 - \frac{2M}{r}\right)^{-1} dr^2 \;\; \begin{cases} > 0 & \text{if } r > 2M , \\ = 0 & \text{if } r = 2M , \\ < 0 & \text{if } r < 2M . \end{cases} \tag{7.15} $$

It appears then that the Schwarzschild time and radial coordinates swap roles inside the horizon. Inside the horizon, the radial coordinate becomes timelike, meaning that it becomes a possible worldline of a massive observer. That is, a trajectory at fixed t and decreasing r is a possible worldline. Again this reflects the fact that space is falling faster than light inside the horizon. A person inside the horizon is inevitably compelled as time goes by to move to smaller radial coordinate r.
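As a rough numerical illustration of (7.13) and (7.14), the sketch below evaluates the Schwarzschild radius in SI units and classifies an interval at rest by the sign of $g_{tt}$. It is not from the original text: the function names are hypothetical, and the physical constants are rounded values assumed for the example.

```python
# Rounded SI values, assumed for illustration only.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg

def schwarzschild_radius(mass_kg):
    """Schwarzschild radius r_s = 2GM/c^2 of equation (7.13), in metres."""
    return 2.0 * G * mass_kg / C**2

def interval_at_rest(r, M=1.0):
    """Character of an interval with dr = dtheta = dphi = 0, per (7.14).

    Uses geometric units c = G = 1, so the horizon is at r = 2M.
    """
    g_tt = -(1.0 - 2.0 * M / r)
    if g_tt < 0.0:
        return "timelike"
    if g_tt == 0.0:
        return "lightlike"
    return "spacelike"

rs_sun = schwarzschild_radius(M_SUN)
print("Schwarzschild radius of one solar mass: %.0f m" % rs_sun)
for r in (3.0, 2.0, 1.0):
    print("r = %.1f M: interval at rest is %s" % (r, interval_at_rest(r)))
```

With these inputs the Schwarzschild radius of a solar mass comes out at roughly 3 km, far inside the actual surface of the Sun, which is why the exterior Schwarzschild geometry applies there without any horizon.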


7.6 Proper time

The proper time experienced by an observer at rest in Schwarzschild coordinates, dr = dθ = dφ = 0, is

$$ d\tau = \sqrt{-ds^2} = \left(1 - \frac{2M}{r}\right)^{1/2} dt . \tag{7.16} $$

For an observer at rest at infinity, r → ∞, the proper time is the same as the coordinate time,

$$ d\tau \to dt \quad \text{as } r \to \infty . \tag{7.17} $$

Among other things, this implies that the Schwarzschild time coordinate t is a scalar: not only is it the unique coordinate with respect to which the metric is manifestly static, but it coincides with the proper time of observers at rest at infinity, which is a coordinate-invariant definition. At finite radii outside the horizon, r > 2M, the proper time dτ is less than the Schwarzschild time dt, so the clocks of observers at rest run slower at smaller than at larger radii. At the horizon, r = 2M, the proper time dτ of an observer at rest goes to zero,

$$ d\tau \to 0 \quad \text{as } r \to 2M . \tag{7.18} $$

This reflects the fact that an object at rest at the horizon is following a null geodesic, and as such experiences zero proper time.

7.7 Redshift

An observer at rest at infinity looking through a telescope at an emitter at rest at radius r sees the emitter redshifted by a factor

$$ 1 + z \equiv \frac{\lambda_{\rm obs}}{\lambda_{\rm emit}} = \frac{\nu_{\rm emit}}{\nu_{\rm obs}} = \frac{d\tau_{\rm obs}}{d\tau_{\rm emit}} = \left(1 - \frac{2M}{r}\right)^{-1/2} . \tag{7.19} $$

This is an example of the universally valid statement that photons are good clocks: the redshift factor is given by the rate at which the emitter’s clock appears to tick relative to the observer’s own clock. It should be emphasized that the redshift factor (7.19) is valid only for an observer and an emitter at rest in the Schwarzschild geometry. If the observer and emitter are not at rest, then additional special relativistic factors will fold into the redshift. The redshift goes to infinity for an emitter at the horizon

$$ 1 + z \to \infty \quad \text{as } r \to 2M . \tag{7.20} $$

Here the redshift tends to infinity regardless of the motion of the observer or emitter. An observer watching an emitter fall through the horizon will see the emitter appear to freeze at the horizon, becoming ever slower and more redshifted. Physically, photons emitted vertically upward at the horizon by an emitter falling through it remain at the horizon for ever, taking an infinite time to get out to the outside observer.
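The clock rate (7.16) and the redshift factor (7.19) are reciprocals of one another for an emitter at rest and an observer at rest at infinity. The sketch below tabulates both as the emitter approaches the horizon. It is an illustration under stated assumptions: geometric units c = G = 1, a default mass M = 1, and hypothetical function names.

```python
import math

def clock_rate(r, M=1.0):
    """d(tau)/dt for an observer at rest at radius r > 2M, equation (7.16)."""
    if r <= 2.0 * M:
        raise ValueError("a massive observer can remain at rest only at r > 2M")
    return math.sqrt(1.0 - 2.0 * M / r)

def redshift_factor(r_emit, M=1.0):
    """1 + z seen from infinity for an emitter at rest at r_emit, equation (7.19)."""
    return 1.0 / clock_rate(r_emit, M)

radii = (100.0, 10.0, 3.0, 2.1, 2.001)
for r in radii:
    print("r = %8.3f M   dtau/dt = %.4f   1+z = %.3f"
          % (r, clock_rate(r), redshift_factor(r)))
```

The table shows the clock rate falling toward zero and the redshift growing without bound as r → 2M, in line with (7.18) and (7.20).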

7.8 Proper distance

The proper radial distance measured by observers at rest in Schwarzschild coordinates, dt = dθ = dφ = 0, is

$$ dl = \sqrt{ds^2} = \left(1 - \frac{2M}{r}\right)^{-1/2} dr . \tag{7.21} $$

For an observer at rest at infinity, r → ∞, an interval of proper radial distance equals an interval of circumferential radial distance, as you might expect for asymptotically flat space

$$ dl \to dr \quad \text{as } r \to \infty . \tag{7.22} $$

At the horizon, r = 2M, a proper radial interval dl measured by an observer at rest goes to infinity

$$ dl \to \infty \quad \text{as } r \to 2M . \tag{7.23} $$
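Equation (7.21) can be integrated to give the proper radial distance between two radii. The sketch below, which is not part of the original text, compares a midpoint-rule integration with a closed-form antiderivative, $\sqrt{r(r-2M)} + 2M \cosh^{-1}\sqrt{r/2M}$, that I obtained by the substitution $r = 2M\cosh^2 u$; treat that expression and the function names as assumptions to be checked rather than as the author's result. Geometric units with M = 1 are assumed.

```python
import math

def proper_distance_closed_form(r, M=1.0):
    """Antiderivative of (1 - 2M/r)^(-1/2), normalised to vanish at r = 2M."""
    return (math.sqrt(r * (r - 2.0 * M))
            + 2.0 * M * math.acosh(math.sqrt(r / (2.0 * M))))

def proper_distance_numeric(r1, r2, M=1.0, steps=100000):
    """Midpoint-rule integral of dl = (1 - 2M/r)^(-1/2) dr from r1 to r2."""
    h = (r2 - r1) / steps
    total = 0.0
    for i in range(steps):
        r = r1 + (i + 0.5) * h
        total += h / math.sqrt(1.0 - 2.0 * M / r)
    return total

exact = proper_distance_closed_form(10.0) - proper_distance_closed_form(3.0)
approx = proper_distance_numeric(3.0, 10.0)
horizon_to_10 = proper_distance_closed_form(10.0)
print("proper distance from r = 3M to 10M: %.6f (closed form), %.6f (numeric)"
      % (exact, approx))
print("proper distance from r = 2M to 10M: %.6f" % horizon_to_10)
```

The proper distance between r = 3M and r = 10M exceeds the coordinate difference of 7M, illustrating the radial stretching. Although the integrand dl/dr diverges at the horizon, the integrated distance from r = 2M outward remains finite in this calculation.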

7.9 “Schwarzschild singularity”

The apparent singularity in the Schwarzschild metric at the horizon r = 2M is not a real singularity, because it can be removed by a change of coordinates, such as to Gullstrand-Painlevé coordinates (7.26). Until as late as the 1950s, people, including Einstein, thought that the “Schwarzschild singularity” at r = 2M marked the physical boundary of the Schwarzschild spacetime. After all, an outside observer watching stuff fall in never sees anything beyond that boundary.

Schwarzschild’s choice of coordinates was certainly a natural one. It was natural to search for static solutions, and his time coordinate t is the only one with respect to which the metric is manifestly static. The problem is that physically there can be no static observers inside the horizon: they must necessarily fall inward as time passes. The fact that Schwarzschild’s coordinate system shows an apparent singularity at the horizon reflects the fact that the assumption of a static spacetime necessarily breaks down at the horizon, where space is falling at the speed of light.

Does stuff “actually” fall in, even though no outside observer ever sees it happen? Classically, the answer is yes: when a black hole forms, it does actually collapse, and when an observer falls through the horizon, they really do fall through the horizon. The reason that an outside observer sees everything freeze at the horizon is simply a light travel time effect: it takes an infinite time for light to lift off the horizon and make it to the outside world.

7.10 Embedding diagram

An embedding diagram is a visual aid to understanding geometry. It is a depiction of a lower dimensional geometry in a higher dimension. A classic example is the illustration of the geometry of a 2-sphere embedded in 3-dimensional space. The 2-sphere has a meaning independent of any embedding in 3 dimensions because the geometry of the 2-sphere can be measured by 2-dimensional inhabitants of its surface without reference to any encompassing 3-dimensional space. Nevertheless, the pictorial representation aids imagination.

Textbooks sometimes illustrate the Schwarzschild geometry with an embedding diagram that shows the spatial geometry at a fixed instant of Schwarzschild time t. The diagram illustrates the stretching of proper distances in the radial direction. I’ll let you figure out how to construct this embedding diagram.

It should be emphasized that the embedding diagram of the Schwarzschild geometry at fixed Schwarzschild time t has a limited physical meaning. Fixing the time t means choosing a certain hypersurface through the geometry. Other choices of hypersurface will yield different diagrams. For example, the Gullstrand-Painlevé metric is spatially flat at fixed free-fall time $t_{\rm ff}$, so in that case the embedding diagram would simply illustrate flat space, with no funny business at the horizon.

7.11 Energy-momentum tensor

The energy-momentum tensor of the Schwarzschild geometry is zero, by construction.

7.12 Weyl tensor

It turns out that the 10 components of the Weyl tensor, the tidal part of the Riemann tensor, can be decomposed in any locally inertial frame into 5 complex components of spin 0, ±1, and ±2. In the Schwarzschild metric, all components vanish except the real spin 0 component. This component is a coordinate-invariant scalar, the Weyl scalar C

$$ C = -\frac{M}{r^3} . \tag{7.24} $$

The Weyl scalar, which expresses the presence of tidal forces, goes to infinity at zero radius,

$$ C \to \infty \quad \text{as } r \to 0 , \tag{7.25} $$

signalling the presence of a real singularity at zero radius.

7.13 Gullstrand-Painlevé coordinates

The Gullstrand-Painlevé metric is an alternative metric for the Schwarzschild geometry, discovered independently by Allvar Gullstrand and Paul Painlevé in 1921. When we have done tetrads, we will recognize that the standard way in which metrics are written encodes not only a metric but also a complete tetrad. The Gullstrand-Painlevé line-element (7.26) encodes a tetrad that represents locally inertial frames free-falling radially into the black hole at the Newtonian escape velocity. Unlike Schwarzschild coordinates, there is no singularity at the horizon in Gullstrand-Painlevé coordinates. It is striking that the mathematics was known long before physical understanding emerged.

The Gullstrand-Painlevé metric is

$$ ds^2 = -dt_{\rm ff}^2 + (dr - \beta \, dt_{\rm ff})^2 + r^2 do^2 . \tag{7.26} $$

Here β is the Newtonian escape velocity (with a minus sign because space is falling inward)

$$ \beta = -\left(\frac{2M}{r}\right)^{1/2} \tag{7.27} $$

and $t_{\rm ff}$ is the proper time experienced by an object that free falls radially inward from zero velocity at infinity. The free fall time $t_{\rm ff}$ is related to the Schwarzschild time coordinate t by

$$ dt_{\rm ff} = dt - \frac{\beta}{1 - \beta^2} \, dr , \tag{7.28} $$

which integrates to

$$ t_{\rm ff} = t + 2M \left[ 2 \left(\frac{r}{2M}\right)^{1/2} + \ln \left| \frac{(r/2M)^{1/2} - 1}{(r/2M)^{1/2} + 1} \right| \right] . \tag{7.29} $$

The time axis $g_{t_{\rm ff}}$ in Gullstrand-Painlevé coordinates is not orthogonal to the radial axis $g_r$, but rather is tilted along the radial axis, $g_{t_{\rm ff}} \cdot g_r = g_{t_{\rm ff} r} = -\beta$. The proper time of a person at rest in Gullstrand-Painlevé coordinates, dr = dθ = dφ = 0, is

$$ d\tau = dt_{\rm ff} \sqrt{1 - \beta^2} . \tag{7.30} $$

The horizon occurs where this proper time vanishes, which happens when the infall velocity β is the speed of light

$$ |\beta| = 1 . \tag{7.31} $$

According to equation (7.27), this happens at r = 2M, which is the Schwarzschild radius, as it should be.
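The relation between the differential form (7.28) and its integral (7.29) can be checked numerically. The sketch below differentiates the difference $t_{\rm ff} - t$ from (7.29) by central finite differences and compares the result with the coefficient $-\beta/(1-\beta^2)$ of dr in (7.28). It assumes geometric units with a default M = 1, takes the logarithm of the absolute value so that radii inside the horizon can be tested as well, and uses hypothetical function names.

```python
import math

def beta(r, M=1.0):
    """Infall velocity of equation (7.27), negative for a black hole."""
    return -math.sqrt(2.0 * M / r)

def t_ff_minus_t(r, M=1.0):
    """Difference t_ff - t between free-fall and Schwarzschild time, (7.29)."""
    x = math.sqrt(r / (2.0 * M))
    return 2.0 * M * (2.0 * x + math.log(abs((x - 1.0) / (x + 1.0))))

def slope_expected(r, M=1.0):
    """Coefficient of dr in equation (7.28): -beta / (1 - beta^2)."""
    b = beta(r, M)
    return -b / (1.0 - b * b)

def slope_numeric(r, M=1.0, h=1e-6):
    """Central finite-difference derivative of t_ff - t with respect to r."""
    return (t_ff_minus_t(r + h, M) - t_ff_minus_t(r - h, M)) / (2.0 * h)

sample_radii = (10.0, 5.0, 3.0, 1.0, 0.5)  # both outside and inside r = 2M
for r in sample_radii:
    print("r = %5.2f M   expected %.6f   numeric %.6f"
          % (r, slope_expected(r), slope_numeric(r)))
```

The two columns agree on both sides of the horizon, while each diverges as r → 2M, reflecting the singular behaviour of the Schwarzschild time coordinate there rather than of the free-fall time.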

7.14 Eddington-Finkelstein coordinates

In Schwarzschild coordinates, radially infalling or outfalling light rays appear never to cross the horizon of the Schwarzschild black hole. This feature of Schwarzschild coordinates contributed to the historical misconception that black holes stopped at their horizons. In 1958, David Finkelstein carried out a trivial transformation of the time coordinate which seemed to show that infalling light rays could indeed pass through the horizon. It turned out that Eddington had already discovered the transformation in 1924, though at that time the physical implications were not grasped. Again, it is striking that the mathematics was in place long before physical understanding.

In Schwarzschild coordinates, light rays that fall radially (dθ = dφ = 0) inward or outward follow null geodesics

$$ ds^2 = -\left(1 - \frac{2M}{r}\right) dt^2 + \left(1 - \frac{2M}{r}\right)^{-1} dr^2 = 0 . \tag{7.32} $$

Radial null geodesics thus follow

$$ \frac{dr}{dt} = \pm \left(1 - \frac{2M}{r}\right) \tag{7.33} $$

in which the ± sign is + for outfalling, − for infalling rays. Equation (7.33) shows that dr/dt → 0 as r → 2M, suggesting that null rays, whether infalling or outfalling, never cross the horizon. The solution to equation (7.33) is

$$ t = \pm \left( r + 2M \ln|r - 2M| \right) , \tag{7.34} $$

which shows that Schwarzschild time t approaches ±∞ logarithmically as null rays approach the horizon. Finkelstein defined his time coordinate $t_F$ by

$$ t_F \equiv t + 2M \ln|r - 2M| , \tag{7.35} $$

which has the property that infalling null rays follow

$$ t_F + r = 0 . \tag{7.36} $$

In other words, on a spacetime diagram in Finkelstein coordinates, radially infalling light rays move at 45°, the same as in special relativistic spacetime diagrams.
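The effect of Finkelstein's transformation can be seen by following a single infalling ray. The sketch below evaluates the Schwarzschild time along the ray from (7.34) with the − sign, then converts to the Finkelstein time of (7.35). It assumes geometric units with M = 1 and the zero integration constant of (7.34); the function names are illustrative.

```python
import math

def schwarzschild_time_infalling(r, M=1.0):
    """Schwarzschild time along an infalling radial null ray, (7.34) with the - sign."""
    return -(r + 2.0 * M * math.log(abs(r - 2.0 * M)))

def finkelstein_time(t, r, M=1.0):
    """Finkelstein time coordinate t_F of equation (7.35)."""
    return t + 2.0 * M * math.log(abs(r - 2.0 * M))

ray = []
for r in (10.0, 3.0, 2.01, 2.0001, 1.5, 0.5):
    t = schwarzschild_time_infalling(r)
    t_F = finkelstein_time(t, r)
    ray.append((r, t, t_F))
    print("r = %7.4f M   t = %9.4f   t_F = %8.4f   t_F + r = %.1e"
          % (r, t, t_F, t_F + r))
```

The Schwarzschild time t grows without bound as the ray nears r = 2M, whereas the Finkelstein time passes smoothly through the horizon with $t_F + r = 0$ throughout, as in (7.36).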

7.15 Kruskal-Szekeres coordinates

Since Finkelstein transformed coordinates so that radially infalling light rays moved at 45° in a spacetime diagram, it is natural to look for coordinates in which outfalling as well as infalling light rays are at 45°. Kruskal and Szekeres independently provided such a transformation, in 1960. Define the tortoise (or Regge-Wheeler 1959) coordinate $r^*$ by

$$ r^* \equiv \int \frac{dr}{1 - 2M/r} = r + 2M \ln|r - 2M| . \tag{7.37} $$

Then radially infalling and outfalling null rays follow

$$ \begin{aligned} r^* + t &= 0 \quad \text{infalling} , \\ r^* - t &= 0 \quad \text{outfalling} . \end{aligned} \tag{7.38} $$

In a spacetime diagram in coordinates t and $r^*$, infalling and outfalling light rays are indeed at 45°. Unfortunately the metric in these coordinates is still singular at the horizon r = 2M:

$$ ds^2 = \left(1 - \frac{2M}{r}\right) \left( -dt^2 + dr^{*2} \right) + r^2 do^2 . \tag{7.39} $$

The singularity at the horizon can be eliminated by the following transformation into Kruskal-Szekeres coordinates $t_K$ and $r_K$:

$$ \begin{aligned} r_K + t_K &= e^{(r^* + t)/2} , \\ r_K - t_K &= \pm e^{(r^* - t)/2} , \end{aligned} \tag{7.40} $$

where the ± sign in the last equation is + outside the horizon, − inside the horizon. The Kruskal-Szekeres metric is

$$ ds^2 = r^{-1} e^{-r} \left( -dt_K^2 + dr_K^2 \right) + r^2 do^2 , \tag{7.41} $$

which is non-singular at the horizon. The Schwarzschild radial coordinate r, which appears in the factors $r^{-1} e^{-r}$ and $r^2$ in the Kruskal metric, is to be understood as an implicit function of the Kruskal coordinates $t_K$ and $r_K$.

Figure 7.1 Penrose diagram of the Schwarzschild geometry. (Diagram labels: Singularity (r = 0), Black Hole, Universe, Horizon, Antihorizon, r = ∞, Light; axes: Space, Time.)

7.16 Penrose diagrams

Roger Penrose, as so often, had a novel take on the business of spacetime diagrams. Penrose conceived that the primary purpose of a spacetime diagram should be to portray the causal structure of the spacetime, and that the specific choice of coordinates was largely irrelevant. After all, general relativity allows arbitrary choices of coordinates. In addition to requiring that light rays be at 45°, Penrose wanted to bring regions at infinity (in time or space) to a finite position on the spacetime diagram, so that the entire spacetime could be seen at once. He calls these things conformal diagrams, but the rest of us commonly call them Penrose diagrams.

Penrose diagrams are bona-fide spacetime diagrams. For example, a coordinate transformation from Kruskal to “Penrose” coordinates (the following transformation is not analytic, but Penrose does not care)

$$ \begin{aligned} r_P + t_P &= \frac{r_K + t_K}{2 + |r_K + t_K|} , \\ r_P - t_P &= \frac{r_K - t_K}{2 - |r_K - t_K|} , \end{aligned} \tag{7.42} $$

brings spatial and temporal infinity to finite values of the coordinates, while keeping infalling and outfalling light rays at 45° in the spacetime diagram. However, there are many such transformations, and Penrose would be the last person to advocate any one of them in particular.

Figure 7.2 Penrose diagram of the complete, analytically extended Schwarzschild geometry. (Diagram labels: Universe, Black Hole, Parallel Universe, White Hole, Horizon, Antihorizon, Parallel Horizon, Parallel Antihorizon, r = 0, r = ∞.)

7.17 Schwarzschild white hole, wormhole

The Kruskal-Szekeres spacetime diagram reveals a new feature that was not apparent in Schwarzschild or Finkelstein coordinates. Dredged from the depths of t = −∞ appears a null line $r_K + t_K = 0$. The null line is at radius r = 2M, but it does not correspond to the horizon that a person might fall into. The null line is called the antihorizon. The horizon is sometimes called the true horizon, and the antihorizon the illusory horizon. In a real black hole, only the true horizon is real. The antihorizon is replaced by an exponentially dimming and redshifting image of the star that collapsed to form the black hole.

The Kruskal-Szekeres (= Schwarzschild) geometry is analytic, and there is a unique analytic continuation of the geometry through the antihorizon. The analytic continuation is a time-reversed copy of the original Schwarzschild geometry, glued at the antihorizon. Whereas the original Schwarzschild geometry showed an asymptotically flat region and a black hole region separated by a horizon, the complete analytically extended Schwarzschild geometry shows two asymptotically flat regions, together with a black hole and a white hole. Relativists label the regions I, II, III, and IV, but I like to call them by name: “Universe”, “Black Hole”, “Parallel Universe”, and “White Hole”.

The white hole is a time-reversed version of the black hole. Whereas space falls inward faster than light inside the black hole, space falls outward faster than light inside the white hole. In the Gullstrand-Painlevé metric (7.26), the velocity $\beta = \pm (2M/r)^{1/2}$ is negative for the black hole, positive for the white hole.

The Kruskal or Penrose diagrams show that the universe and the parallel universe are connected, but only by spacelike lines. This spacelike connection is called the Einstein-Rosen bridge, and constitutes a wormhole connecting the two universes. Because the connection is spacelike, it is impossible for a traveler to pass through this wormhole. Although two travelers, one from the universe and one from the parallel universe, cannot travel to each other’s universe, they can meet, but only inside the black hole. Inside the black hole, they can talk to each other, and they can see light from each other’s universe. Sadly, the enlightenment is only temporary, because they are doomed soon to hit the central singularity.

It should be emphasized that the white hole and the wormhole in the Schwarzschild geometry are a mathematical construction with, as far as anyone knows, no relevance to reality. Nevertheless it is intriguing that such bizarre objects emerge already in the simplest general relativistic solution for a black hole.

7.18 Collapse to a black hole

Realistic collapse of a star to a black hole is not expected to produce a white hole or parallel universe. The simplest model of a collapsing star is a spherical ball of uniform density and zero pressure which free falls from zero velocity at infinity. In this simple model, the interior of the star is described by a collapsing Friedmann-Robertson-Walker metric (the canonical cosmological metric), while the exterior is described by the Schwarzschild solution. The assumption that the star collapses from zero velocity at infinity implies that the FRW metric is spatially flat, the simplest case. To continue the geometry between Schwarzschild and FRW metrics, it is neatest to use the Gullstrand-Painlevé metric, with the Gullstrand-Painlevé infall velocity β at the edge of the star set equal to minus r times the Hubble parameter, $-rH \equiv -r \, d\ln a/dt$, of the collapsing FRW metric. The simple model shows that the antihorizon of the complete Schwarzschild geometry is replaced by the surface of the collapsing star, and that beyond the antihorizon is not a parallel universe and a white hole, but merely the interior of the star (and the distant Universe glimpsed through the star’s interior).

Since light can escape from the collapsing star system as long as it is even slightly larger than its Schwarzschild radius, it is possible to take the view that the horizon comes instantaneously into being at the moment the star collapses through its Schwarzschild radius. This definition of the horizon is called the apparent horizon. Hawking has advocated that a better definition of the horizon is to take it to be the boundary between outgoing null rays that fall into the black hole versus those that go to infinity. In any evolving situation, this definition of the horizon, which is called the absolute horizon, depends formally on what happens in the infinite future, though in slowly evolving systems the absolute horizon can be located with some precision without knowing the future. The absolute horizon of the collapsing star forms before the star has collapsed, and grows to meet the apparent horizon as the star falls through its Schwarzschild radius.

In this simple model, the central singularity forms slightly before the star has collapsed to zero radius. The formation of the singularity is marked by the fact that light rays emitted at zero radius cease to be able to move outward. In other words, the singularity forms when space starts to fall into it faster than light.


7.19 Killing vectors

The Schwarzschild metric presents an opportunity to introduce the concept of Killing vectors (after Wilhelm Killing, not because the vectors kill things, though the latter is true), which are associated with symmetries of the spacetime.

7.20 Time translation symmetry

The time translation invariance of the Schwarzschild geometry is evident from the fact that the metric is independent of the time coordinate t. Equivalently, the partial time derivative ∂/∂t of the Schwarzschild metric is zero. The associated Killing vector $\xi^\mu$ is then defined by

$$ \xi^\mu \frac{\partial}{\partial x^\mu} = \frac{\partial}{\partial t} \tag{7.43} $$

so that in Schwarzschild coordinates {t, r, θ, φ}

$$ \xi^\mu = \{1, 0, 0, 0\} . \tag{7.44} $$

In coordinate-independent notation, the Killing vector is

$$ \xi = g_\mu \xi^\mu = g_t . \tag{7.45} $$

This may seem like overkill: couldn’t we just say that the metric is independent of time t and be done with it? The answer is that symmetries are not always evident from the metric, as will be seen in the next section 7.21.

Because the Killing vector $g_t$ is the unique timelike Killing vector of the Schwarzschild geometry, it has a definite meaning independent of the coordinate system. It follows that its scalar product with itself is a coordinate-independent scalar

$$ \xi_\mu \xi^\mu = g_t \cdot g_t = g_{tt} = -\left(1 - \frac{2M}{r}\right) . \tag{7.46} $$

In curved spacetimes, it is hugely important to be able to identify scalars, which have a physical meaning independent of the choice of coordinates.

7.21 Spherical symmetry

The rotational symmetry of the Schwarzschild metric about the azimuthal axis is evident from the fact that the metric is independent of the azimuthal coordinate φ. The associated Killing vector is

$$ g_\phi \tag{7.47} $$

with components {0, 0, 0, 1} in Schwarzschild coordinates {t, r, θ, φ}.

The Schwarzschild metric is fully spherically symmetric, not just azimuthally symmetric. Since the 3D rotation group O(3) is 3-dimensional, it is to be expected that there are three Killing vectors. You may recognize from quantum mechanics that ∂/∂φ is (modulo factors of i and $\hbar$) the z-component of the angular momentum operator $L = \{L_x, L_y, L_z\}$ in a coordinate system where the azimuthal axis is the z-axis. The 3 components of the angular momentum operator are given by:

$$ \begin{aligned} iL_x &= y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} = -\sin\phi \frac{\partial}{\partial\theta} - \cot\theta \cos\phi \frac{\partial}{\partial\phi} , \\ iL_y &= z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} = \cos\phi \frac{\partial}{\partial\theta} - \cot\theta \sin\phi \frac{\partial}{\partial\phi} , \\ iL_z &= x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} = \frac{\partial}{\partial\phi} . \end{aligned} \tag{7.48} $$

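The 3 rotational Killing vectors are correspondingly:

$$ \begin{aligned} \text{rotation about } x\text{-axis:} \quad & -\sin\phi \, g_\theta - \cot\theta \cos\phi \, g_\phi , \\ \text{rotation about } y\text{-axis:} \quad & \cos\phi \, g_\theta - \cot\theta \sin\phi \, g_\phi , \\ \text{rotation about } z\text{-axis:} \quad & g_\phi . \end{aligned} \tag{7.49} $$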

You can check that the action of the x and y rotational Killing vectors on the metric does not kill the metric. For example, with $g_{\phi\phi} = r^2 \sin^2\theta$ from the metric (7.12), $iL_x g_{\phi\phi} = -2r^2 \sin\phi \sin\theta \cos\theta$ does not vanish. This example shows that a more powerful and general condition, described in the next section 7.22, is needed to establish whether a quantity is or is not a Killing vector.

Because spherical symmetry does not define a unique azimuthal axis $g_\phi$, its scalar product with itself, $g_\phi \cdot g_\phi = g_{\phi\phi} = r^2 \sin^2\theta$, is not a coordinate-invariant scalar. However, the sum of the scalar products of the 3 rotational Killing vectors is rotationally invariant, and is therefore a coordinate-invariant scalar

$$ (-\sin\phi \, g_\theta - \cot\theta \cos\phi \, g_\phi)^2 + (\cos\phi \, g_\theta - \cot\theta \sin\phi \, g_\phi)^2 + g_\phi^2 = g_{\theta\theta} + (\cot^2\theta + 1) g_{\phi\phi} = 2r^2 . \tag{7.50} $$

This shows that the circumferential radius r is a scalar, as you would expect.
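The rotational invariance of the summed scalar products can be confirmed numerically. The sketch below evaluates the three Killing vectors (7.49) at several angles and sums their squared norms using the metric components $g_{\theta\theta} = r^2$ and $g_{\phi\phi} = r^2 \sin^2\theta$ read off from (7.12), with the signature in which spatial components are positive; with that convention the sum equals 2r². The function names and sample angles are illustrative assumptions.

```python
import math

def killing_vectors(theta, phi):
    """(theta, phi) components of the three rotational Killing vectors (7.49)."""
    cot = math.cos(theta) / math.sin(theta)
    return {
        "x": (-math.sin(phi), -cot * math.cos(phi)),
        "y": (math.cos(phi), -cot * math.sin(phi)),
        "z": (0.0, 1.0),
    }

def norm_squared(vec, r, theta):
    """Scalar product of a vector with itself on the 2-sphere metric (7.12).

    Uses g_theta_theta = r^2, g_phi_phi = r^2 sin^2(theta), and no cross term.
    """
    v_theta, v_phi = vec
    return r**2 * v_theta**2 + (r * math.sin(theta))**2 * v_phi**2

def summed_norm(r, theta, phi):
    """Sum of the squared norms of the three Killing vectors, as in (7.50)."""
    return sum(norm_squared(v, r, theta)
               for v in killing_vectors(theta, phi).values())

r = 3.0
angles = [(0.3, 0.1), (1.0, 2.0), (2.5, 4.0)]  # arbitrary sample directions
for theta, phi in angles:
    z_norm = norm_squared(killing_vectors(theta, phi)["z"], r, theta)
    print("theta = %.1f, phi = %.1f: |g_phi|^2 = %.4f, summed = %.4f"
          % (theta, phi, z_norm, summed_norm(r, theta, phi)))
```

The norm of $g_\phi$ alone changes with θ, whereas the sum stays fixed at 2r² for every direction sampled.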

7.22 Killing equation

As seen in the previous section, a Killing vector does not always kill the metric in a given coordinate system. This is not really surprising given the arbitrariness of coordinates in GR. What is true is that a quantity is a Killing vector if and only if there exists a coordinate system such that the Killing vector kills the metric in that system.

Suppose that in some coordinate system the metric is independent of the coordinate φ. In problem set 2 you showed that in such a case the covariant component $u_\phi$ of the 4-velocity along a geodesic is constant

$$ u_\phi = \text{constant} . \tag{7.51} $$

Equivalently

$$ \xi^\nu u_\nu = \text{constant} \tag{7.52} $$

where $\xi^\nu$ is the associated Killing vector, whose only non-zero component is $\xi^\phi = 1$ in this particular coordinate system. The converse is also true: if $\xi^\nu u_\nu$ = constant along all geodesics, then $\xi^\nu$ is a Killing vector. The constancy of $\xi^\nu u_\nu$ along all geodesics is equivalent to the condition that its proper time derivative vanish along all geodesics

$$ \frac{d \, \xi^\nu u_\nu}{d\tau} = 0 . \tag{7.53} $$

But this is equivalent to

$$ 0 = u^\mu D_\mu (\xi^\nu u_\nu) = u^\mu u^\nu D_\mu \xi_\nu = \frac{1}{2} u^\mu u^\nu (D_\mu \xi_\nu + D_\nu \xi_\mu) \tag{7.54} $$

where the second equality follows from the geodesic equation, $u^\mu D_\mu u_\nu = 0$, and the last equality is true because of the symmetry of $u^\mu u^\nu$ in μ ↔ ν. A necessary and sufficient condition for equation (7.54) to be true for all geodesics is that

$$ D_\mu \xi_\nu + D_\nu \xi_\mu = 0 \tag{7.55} $$

which is Killing’s equation. This equation is the desired necessary and sufficient condition for $\xi^\nu$ to be a Killing vector. It is a generally covariant equation, valid in any coordinate system.

8 Reissner-Nordström Black Hole

The Reissner-Nordström geometry, discovered independently by Hans Reissner in 1916, Hermann Weyl in 1917, and Gunnar Nordström in 1918, describes the unique spherically symmetric static solution for a black hole with mass and electric charge in asymptotically flat spacetime.

8.1 Reissner-Nordström metric

The Reissner-Nordström metric for a black hole of mass M and electric charge Q is, in geometric units c = G = 1,

$$ ds^2 = -\left(1 - \frac{2M}{r} + \frac{Q^2}{r^2}\right) dt^2 + \left(1 - \frac{2M}{r} + \frac{Q^2}{r^2}\right)^{-1} dr^2 + r^2 do^2 \tag{8.1} $$

which looks like the Schwarzschild metric with the replacement

$$ M \to M(r) = M - \frac{Q^2}{2r} . \tag{8.2} $$

In fact equation (8.2) has a coordinate independent interpretation as the mass M(r) interior to radius r, which here is the mass M at infinity, minus the mass in the electric field $E = Q/r^2$ outside r

$$ \int_r^\infty \frac{E^2}{8\pi} \, 4\pi r^2 dr = \int_r^\infty \frac{Q^2}{8\pi r^4} \, 4\pi r^2 dr = \frac{Q^2}{2r} . \tag{8.3} $$

This seems like a Newtonian calculation of the energy in the electric field, but it turns out to be valid also in general relativity. Real astronomical black holes probably have very little electric charge, because the Universe as a whole appears almost electrically neutral (and Maxwell’s equations in fact require that the Universe in its entirety should be exactly electrically neutral), and a charged black hole would quickly neutralize itself. It would probably not neutralize itself completely, but have some small residual positive charge, because protons (positive charge) are more massive than electrons (negative charge), so it is slightly easier for a black hole to accrete protons than electrons. Nevertheless, the Reissner-Nordström solution is of more than passing interest because its internal geometry resembles that of the Kerr solution for a rotating black hole.

Concept question 8.1 What is the charge Q in standard (gaussian) units?

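8.2 Energy-momentum tensor

The Einstein tensor of the Reissner-Nordström metric (8.1) is diagonal, with elements given by

$$ G^\nu_\mu = \begin{pmatrix} G^t_t & 0 & 0 & 0 \\ 0 & G^r_r & 0 & 0 \\ 0 & 0 & G^\theta_\theta & 0 \\ 0 & 0 & 0 & G^\phi_\phi \end{pmatrix} = 8\pi \begin{pmatrix} -\rho & 0 & 0 & 0 \\ 0 & p_r & 0 & 0 \\ 0 & 0 & p_\perp & 0 \\ 0 & 0 & 0 & p_\perp \end{pmatrix} = \frac{Q^2}{r^4} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{8.4} $$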

The trick of writing one index up and the other down on the Einstein tensor $G^\nu_\mu$ partially cancels the distorting effect of the metric, yielding the proper energy density ρ, the proper radial pressure $p_r$, and transverse pressure $p_\perp$, up to factors of ±1. A more systematic way to extract proper quantities is to work in the tetrad formalism, but this will do for now. The energy-momentum tensor is that of a radial electric field

$$ E = \frac{Q}{r^2} . \tag{8.5} $$

Notice that the radial pressure $p_r$ is negative, while the transverse pressure $p_\perp$ is positive. It is no coincidence that the sum of the energy density and pressures is twice the energy density, $\rho + p_r + 2p_\perp = 2\rho$. The negative pressure, or tension, of the radial electric field produces a gravitational repulsion that dominates at small radii, and that is responsible for much of the strange phenomenology of the Reissner-Nordström geometry. The gravitational repulsion mimics the centrifugal repulsion inside a rotating black hole, for which reason the Reissner-Nordström geometry is often used as a surrogate for the rotating Kerr-Newman geometry. At this point, the statements that the energy-momentum tensor is that of a radial electric field, and that the radial tension produces a gravitational repulsion that dominates at small radii, are true but unjustified assertions.

8.3 Weyl tensor

As with the Schwarzschild geometry (indeed, any spherically symmetric geometry), only 1 of the 10 independent spin components of the Weyl tensor is non-vanishing, the real spin-0 component, the Weyl scalar C. The Weyl scalar for the Reissner-Nordström geometry is

$$ C = - \frac{M}{r^3} + \frac{Q^2}{r^4} . \tag{8.6} $$

The Weyl scalar goes to infinity at zero radius

$$ C \to \infty \quad \text{as} \quad r \to 0 \tag{8.7} $$

signalling the presence of a real singularity at zero radius.

8.4 Horizons

The Reissner-Nordström geometry has not one but two horizons. The horizons occur where an object at rest in the geometry, $dr = d\theta = d\phi = 0$, follows a null geodesic, $ds^2 = 0$, which occurs where

$$ 1 - \frac{2M}{r} + \frac{Q^2}{r^2} = 0 . \tag{8.8} $$

This is a quadratic equation in r, and it has two solutions, an outer horizon $r_+$ and an inner horizon $r_-$

$$ r_\pm = M \pm \sqrt{M^2 - Q^2} . \tag{8.9} $$

It is straightforward to check that the Reissner-Nordstr¨ om time coordinate t is timelike outside the outer horizon, r > r+ , spacelike between the horizons r− < r < r+ , and again timelike inside the inner horizon r < r− . Conversely, the radial coordinate r is spacelike outside the outer horizon, r > r+ , timelike between the horizons r− < r < r+ , and spacelike inside the inner horizon r < r− . The physical meaning of this strange behaviour is akin to that of the Schwarzschild geometry. As in the Schwarzschild geometry, outside the outer horizon space is falling at less than the speed of light; at the outer horizon space hits the speed of light; and inside the outer horizon space is falling faster than light. But a new ingredient appears. The gravitational repulsion caused by the negative pressure of the electric field slows down the flow of space, so that it slows back down to the speed of light at the inner horizon. Inside the inner horizon space is falling at less than the speed of light.
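The horizon radii can be checked numerically. The following is a minimal sketch of equations (8.8) and (8.9), assuming geometric units (G = c = 1); the function names are illustrative choices, not notation from the text.

```python
import math

def rn_horizons(M, Q):
    """Inner and outer horizon radii (r_minus, r_plus) from equation (8.9).

    Returns None when Q^2 > M^2, in which case equation (8.8) has no real roots.
    """
    disc = M * M - Q * Q
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return M - root, M + root

def horizon_function(r, M, Q):
    """Left-hand side of equation (8.8): 1 - 2M/r + Q^2/r^2."""
    return 1.0 - 2.0 * M / r + Q * Q / (r * r)

# Illustrative values: M = 1, Q = 0.8 gives horizons near r = 0.4 and r = 1.6.
r_minus, r_plus = rn_horizons(1.0, 0.8)
print(r_minus, r_plus)
```

The horizon function is negative between the two radii, which is where the roles of t and r as timelike and spacelike coordinates are exchanged.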

8.5 Gullstrand-Painlevé metric

Deeper insight into the Reissner-Nordström geometry comes from examining its Gullstrand-Painlevé metric. The Gullstrand-Painlevé metric for the Reissner-Nordström geometry is the same as that for the Schwarzschild geometry

$$ ds^2 = - dt_{\rm ff}^2 + (dr - \beta \, dt_{\rm ff})^2 + r^2 do^2 . \tag{8.10} $$


The velocity β is again the escape velocity, but this is now

$$ \beta = \mp \sqrt{\frac{2M(r)}{r}} \tag{8.11} $$

where $M(r) = M - Q^2/2r$ is the interior mass already given as equation (8.2). Horizons occur where the magnitude of the velocity β equals the speed of light

$$ |\beta| = 1 \tag{8.12} $$

which happens at the outer and inner horizons $r = r_+$ and $r = r_-$, equation (8.9). The Gullstrand-Painlevé metric once again paints the picture of space falling into the black hole. Outside the outer horizon $r_+$ space falls at less than the speed of light, at the horizon space falls at the speed of light, and inside the horizon space falls faster than light. But the gravitational repulsion produced by the tension of the radial electric field starts to slow down the inflow of space, so that the infall velocity reaches a maximum at $r = Q^2/M$. The infall slows back down to the speed of light at the inner horizon $r_-$. Inside the inner horizon, the flow of space slows all the way to zero velocity, β = 0, at the turnaround radius

$$ r_0 = \frac{Q^2}{2M} . \tag{8.13} $$

Space then turns around, the velocity β becoming positive, and accelerates back up to the speed of light. Space is now accelerating outward, to larger radii r. The outfall velocity reaches the speed of light at the inner horizon r− , but now the motion is outward, not inward. Passing back out through the inner horizon, space is falling outward faster than light. This is not the black hole, but an altogether new piece of spacetime, a white hole. The white hole looks like a time-reversed black hole. As space falls outward, the gravitational repulsion produced by the tension of the radial electric field declines, and the outflow slows. The outflow slows back to the speed of light at the outer horizon r+ of the white hole. Outside the outer horizon of the white hole is a new universe, where once again space is flowing at less than the speed of light. What happens inward of the turnaround radius r0 , equation (8.13)? Inside this radius the interior mass M (r), equation (8.2), is negative, and the velocity β is imaginary. The interior mass M (r) diverges to negative infinity towards the central singularity at r → 0. The singularity is timelike, and infinitely gravitationally repulsive, unlike the central singularity of the Schwarzschild geometry. Is it physically realistic to have a singularity that has infinite negative mass and is infinitely gravitationally repulsive? Undoubtedly not.
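The behaviour of the infall velocity can be traced numerically. The sketch below evaluates $\beta^2 = 2M(r)/r$ with $M(r) = M - Q^2/2r$, assuming geometric units; the helper names and the sample values M = 1, Q = 0.8 are illustrative assumptions.

```python
import math

def interior_mass(r, M, Q):
    """Interior mass M(r) = M - Q^2 / (2 r)."""
    return M - Q * Q / (2.0 * r)

def beta_squared(r, M, Q):
    """Square of the river velocity, beta^2 = 2 M(r) / r.

    Negative values mean beta is imaginary (inside the turnaround radius).
    """
    return 2.0 * interior_mass(r, M, Q) / r

M, Q = 1.0, 0.8
r_max = Q * Q / M            # radius of maximum infall speed
r_turn = Q * Q / (2.0 * M)   # turnaround radius, equation (8.13)

for r in (3.0, 1.6, r_max, 0.4, r_turn, 0.2):
    b2 = beta_squared(r, M, Q)
    speed = math.sqrt(b2) if b2 >= 0 else float("nan")
    print(f"r = {r:.3f}  beta^2 = {b2:+.4f}  |beta| = {speed:.4f}")
```

With these values the speed passes through 1 at both horizons, peaks at M/Q = 1.25 at $r = Q^2/M$, and drops to zero at $r_0$; inside $r_0$ the squared velocity is negative.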

8.6 Complete Reissner-Nordström geometry

As with the Schwarzschild geometry, it is possible to go through the steps: Reissner-Nordström coordinates → Eddington-Finkelstein coordinates → Kruskal-Szekeres coordinates → Penrose coordinates. The conclusion of these constructions is that the Reissner-Nordström geometry can be analytically continued, and the complete analytic continuation consists of an infinite ladder of universes and parallel universes connected to each other by black hole → wormhole → white hole tunnels. I like to call the various pieces of spacetime “Universe”, “Parallel Universe”, “Black Hole”, “Wormhole”, “Parallel Wormhole”, and “White Hole”. These pieces repeat in an infinite ladder. The Wormhole and Parallel Wormhole contain separate central singularities, the “Singularity” and the “Parallel Singularity”, which are oppositely charged. If the black hole is positively charged as measured by observers in the Universe, then it is negatively charged as measured by observers in the Parallel Universe, and the Wormhole contains a positive charge singularity while the Parallel Wormhole contains a negative charge singularity.

Figure 8.1 Penrose diagram of the complete Reissner-Nordström geometry. Regions labelled in the diagram: Universe, Parallel Universe, Black Hole, Wormhole, Parallel Wormhole, Antiverse, Parallel Antiverse, White Hole, New Universe, New Parallel Universe; the horizons, antihorizons, and the boundaries r = ∞ and r = −∞ are also marked.

Where does the electric charge of the Reissner-Nordström geometry “actually” reside? This comes down to the question of how observers detect the presence of charge. Observers detect charge by the electric field that it produces. Equip all (radially moving) observers with a gyroscope that they orient consistently in the same radial direction, which can be taken to be towards the black hole as measured by observers in the Universe. Observers in the Parallel Universe find that their gyroscope is pointed away from the black hole. Inside the black hole, observers from either Universe agree that the gyroscope is pointed towards the Wormhole, and away from the Parallel Wormhole. All observers agree that the electric field is pointed in the same radial direction. Observers who end up inside the Wormhole measure an electric field that appears to emanate from the Singularity, and which they therefore attribute to charge in the Singularity. Observers who end up inside the Parallel Wormhole measure an electric field that appears to emanate in the opposite direction from the Parallel Singularity, and which they therefore attribute to charge of opposite sign in the Parallel Singularity. Strange, but all consistent.

8.7 Antiverse: Reissner-Nordström geometry with negative mass

It is also possible to consider the Reissner-Nordström geometry for negative values of the radius r. I call the extension to negative r the “Antiverse”. There is also a “Parallel Antiverse”. Changing the sign of r in the Reissner-Nordström metric (8.1) is equivalent to changing the sign of the mass M. Thus the Reissner-Nordström metric with negative r describes a charged black hole of negative mass M M ,

(8.19)

has no horizons. The change in geometry from an extremal black hole, with horizon at finite radius $r_+ = r_- = M$, to one without horizons is discontinuous. This suggests that there is no way to pack a black hole with more charge than its mass. Indeed, if you try to force additional charge into an extremal black hole, then the work needed to do so increases its mass so that the charge Q does not exceed its mass M. Real fundamental particles nevertheless have charge far exceeding their mass. For example, the charge-to-mass ratio of a proton is

$$ \frac{e}{m_p} \approx 10^{18} \tag{8.20} $$

where e is the square root of the fine-structure constant $\alpha \equiv e^2/\hbar c \approx 1/137$, and $m_p \approx 10^{-19}$ is the mass of the proton in Planck units. However, the Schwarzschild radius of such a fundamental particle is far tinier than its Compton wavelength $\sim \hbar/m$ (or its classical radius $e^2/m = \alpha\hbar/m$), so quantum mechanics, not general relativity, governs the structure of these fundamental particles.
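The order of magnitude in equation (8.20) can be reproduced with a short calculation. This is a sketch under stated assumptions: the numerical constants below are approximate CODATA values inserted for illustration and do not appear in the text.

```python
import math

# Approximate values, assumed for illustration (not quoted in the text).
ALPHA = 1.0 / 137.035999        # fine-structure constant
M_PROTON_KG = 1.67262192e-27    # proton mass in kilograms
M_PLANCK_KG = 2.176434e-8       # Planck mass in kilograms

def proton_mass_planck_units():
    """Proton mass expressed in units of the Planck mass."""
    return M_PROTON_KG / M_PLANCK_KG

def proton_charge_to_mass():
    """Charge-to-mass ratio e / m_p with e = sqrt(alpha) in Planck units."""
    return math.sqrt(ALPHA) / proton_mass_planck_units()

print(f"m_p   = {proton_mass_planck_units():.3e} Planck masses")
print(f"e/m_p = {proton_charge_to_mass():.3e}")
```

The ratio comes out a little above 10^18, far beyond the extremal limit Q = M of a black hole.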

8.15 Reissner-Nordström geometry with imaginary charge

It is possible formally to consider the Reissner-Nordström geometry with imaginary charge Q

$$ Q^2 < 0 . \tag{8.21} $$

This is completely unphysical. If charge were imaginary, then electromagnetic energy would be negative. However, the Reissner-Nordström metric with $Q^2 < 0$ is well-defined, and it is possible to calculate geodesics in that geometry. What makes the geometry interesting is that the singularity, instead of being gravitationally repulsive, becomes gravitationally attractive. Thus particles, instead of bouncing off the singularity, are attracted to it, and it turns out to be possible to continue geodesics through the singularity. Mathematically, the geometry can be considered as the Kerr-Newman geometry in the limit of zero spin. In the Kerr-Newman geometry, geodesics can pass from positive to negative radius r, and the passage through the singularity of the Reissner-Nordström geometry can be regarded as this process in the limit of zero spin. Suffice to say that it is intriguing to see what it looks like to pass through the singularity of a charged black hole of imaginary charge, however unrealistic. The Penrose diagram is even more eventful than that for the usual Reissner-Nordström geometry.

Figure 8.4 Penrose diagram of the Reissner-Nordström geometry with imaginary charge Q. If charge were imaginary, then electromagnetic energy would be negative, which is completely unphysical. But the metric is well-defined, and the spacetime is fun. Regions labelled in the diagram: Universe, Parallel Universe, Black Hole, White Hole, Antiverse, Parallel Antiverse, New Universe, New Parallel Universe, and the singularity at r = 0.

9 Kerr-Newman Black Hole

The geometry of a stationary, rotating, uncharged black hole in asymptotically flat empty space was discovered unexpectedly by Roy Kerr in 1963. Kerr’s (2007) own account of the history of the discovery is at http://arxiv.org/abs/0706.1109. You can read in that paper that the discovery was not mere chance: Kerr used sophisticated mathematical methods to make it. The extension to a rotating electrically charged black hole was made shortly thereafter by Ted Newman (Newman et al. 1965). Newman told me (private communication 2009) that, after seeing Kerr’s work, he quickly realized that the extension to a charged black hole was straightforward. He set the problem to the graduate students in his relativity class, who became coauthors of Newman et al. (1965). The importance of the Kerr-Newman geometry stems in part from the no-hair theorem, which states that this geometry is the unique end state of spacetime outside the horizon of an undisturbed black hole in asymptotically flat space.

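9.1 Boyer-Lindquist metric

The Boyer-Lindquist metric of the Kerr-Newman geometry is

$$ ds^2 = - \frac{\Delta}{\rho^2} \left( dt - a \sin^2\theta \, d\phi \right)^2 + \frac{\rho^2}{\Delta} dr^2 + \rho^2 d\theta^2 + \frac{R^4 \sin^2\theta}{\rho^2} \left( d\phi - \frac{a}{R^2} dt \right)^2 \tag{9.1} $$

where R and ρ are defined by

$$ R \equiv \sqrt{r^2 + a^2} , \quad \rho \equiv \sqrt{r^2 + a^2 \cos^2\theta} , \tag{9.2} $$

and Δ is the horizon function defined by

$$ \Delta \equiv R^2 - 2Mr + Q^2 . \tag{9.3} $$

At large radius r, the Boyer-Lindquist metric is

$$ ds^2 \to - \left( 1 - \frac{2M}{r} \right) dt^2 + \left( 1 + \frac{2M}{r} \right) dr^2 + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) - \frac{4aM \sin^2\theta}{r} \, dt \, d\phi . \tag{9.4} $$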


Comparison of this metric to the metric of a weak field establishes that M is the mass of the black hole and a is its angular momentum per unit mass. For positive a, the black hole rotates right-handedly about its polar axis θ = 0. The Boyer-Lindquist line-element (9.1) defines not only a metric but also a tetrad. The Boyer-Lindquist coordinates and tetrad are carefully chosen to exhibit the symmetries of the geometry. In the locally inertial frame defined by the Boyer-Lindquist tetrad, the energy-momentum tensor (which is non-vanishing for charged Kerr-Newman) and the Weyl tensor are both diagonal. These assertions become apparent only in the tetrad frame, and are obscure in the coordinate frame.

9.2 Oblate spheroidal coordinates

Boyer-Lindquist coordinates r, θ, φ are oblate spheroidal coordinates (not polar coordinates). Corresponding Cartesian coordinates are

$$ x = R \sin\theta \cos\phi , \quad y = R \sin\theta \sin\phi , \quad z = r \cos\theta . \tag{9.5} $$

Surfaces of constant r are confocal oblate spheroids, satisfying

$$ \frac{x^2 + y^2}{r^2 + a^2} + \frac{z^2}{r^2} = 1 . \tag{9.6} $$

Equation (9.6) implies that the spheroidal coordinate r is given in terms of x, y, z by an equation that is quadratic in $r^2$,

$$ r^4 - r^2 (x^2 + y^2 + z^2 - a^2) - a^2 z^2 = 0 . \tag{9.7} $$
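Equation (9.7) can be solved in closed form for $r^2$. The sketch below takes the non-negative root and checks it by a round trip through the Cartesian coordinates (9.5); the function names are illustrative, and only the branch r ≥ 0 is returned.

```python
import math

def to_cartesian(r, theta, phi, a):
    """Cartesian coordinates of equation (9.5), with R = sqrt(r^2 + a^2)."""
    R = math.sqrt(r * r + a * a)
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def spheroidal_radius(x, y, z, a):
    """Non-negative root r of r^4 - r^2 (x^2+y^2+z^2-a^2) - a^2 z^2 = 0."""
    b = x * x + y * y + z * z - a * a
    # Quadratic in u = r^2: u^2 - b u - a^2 z^2 = 0; take the root with u >= 0.
    u = 0.5 * (b + math.sqrt(b * b + 4.0 * a * a * z * z))
    return math.sqrt(max(u, 0.0))

x, y, z = to_cartesian(2.0, 0.7, 1.3, 0.9)
print(spheroidal_radius(x, y, z, 0.9))   # recovers r close to 2
```

Points in the equatorial plane with $x^2 + y^2 < a^2$ return r = 0, which is the disk spanned by the focal ring of the spheroids.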

9.3 Time and rotation symmetries

The Boyer-Lindquist metric coefficients are independent of the time coordinate t and of the azimuthal angle φ. This shows that the Kerr-Newman geometry has time translation symmetry, and rotational symmetry about its azimuthal axis. The time and rotation symmetries mean that the tangent vectors $g_t$ and $g_\phi$ in Boyer-Lindquist coordinates are Killing vectors. It follows that their scalar products

$$
\begin{aligned}
g_t \cdot g_t = g_{tt} &= - \frac{1}{\rho^2} \left( \Delta - a^2 \sin^2\theta \right) , \\
g_t \cdot g_\phi = g_{t\phi} &= - \frac{a \sin^2\theta}{\rho^2} \left( R^2 - \Delta \right) , \\
g_\phi \cdot g_\phi = g_{\phi\phi} &= \frac{\sin^2\theta}{\rho^2} \left( R^4 - a^2 \sin^2\theta \, \Delta \right) ,
\end{aligned}
\tag{9.8}
$$

are all gauge-invariant scalar quantities. As will be seen below, $g_{tt} = 0$ defines the boundary of ergospheres, $g_{t\phi} = 0$ defines the turnaround radius, and $g_{\phi\phi} = 0$ defines the boundary of the toroidal region containing closed timelike curves. The Boyer-Lindquist time t and azimuthal angle φ are arranged further to satisfy the condition that $g_t$ and $g_\phi$ are each orthogonal to both $g_r$ and $g_\theta$.

9.4 Ring singularity

The Kerr-Newman geometry contains a ring singularity where the Weyl tensor (9.21) diverges, ρ = 0, or equivalently at

$$ r = 0 \quad \text{and} \quad \theta = \pi/2 . \tag{9.9} $$

The ring singularity is at the focus of the confocal ellipsoids of the Boyer-Lindquist metric. Physically, the singularity is kept open by the centrifugal force.

9.5 Horizons

The horizon of a Kerr-Newman black hole rotates, as observed by a distant observer, so it is incorrect to try to solve for the location of the horizon by assuming that the horizon is at rest. The worldline of a photon that sits on the horizon, battling against the inflow of space, remains at fixed radius r and polar angle θ, but it moves in time t and azimuthal angle φ. The photon’s 4-velocity is $v^\mu = \{v^t, 0, 0, v^\phi\}$, and the condition that it is on a null geodesic is

$$ 0 = v_\mu v^\mu = g_{\mu\nu} v^\mu v^\nu = g_{tt} (v^t)^2 + 2 \, g_{t\phi} v^t v^\phi + g_{\phi\phi} (v^\phi)^2 . \tag{9.10} $$

This equation has solutions provided that the determinant of the 2 × 2 matrix of metric coefficients in t and φ is less than or equal to zero (why?). The determinant is

$$ g_{tt} \, g_{\phi\phi} - g_{t\phi}^2 = - \sin^2\theta \, \Delta \tag{9.11} $$

where Δ is the horizon function defined above, equation (9.3). Thus if Δ ≥ 0, then there exist null geodesics such that a photon can be instantaneously at rest in r and θ, whereas if Δ < 0, then no such geodesics exist. The boundary

$$ \Delta = 0 \tag{9.12} $$

defines the location of horizons. With Δ given by equation (9.3), equation (9.12) gives outer and inner horizons at

$$ r_\pm = M \pm \sqrt{M^2 - Q^2 - a^2} . \tag{9.13} $$

Between the horizons Δ is negative, and photons cannot be at rest. This is consistent with the picture that space is falling faster than light between the horizons.
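The determinant identity (9.11) and the horizon radii (9.13) lend themselves to a numerical check. The sketch below evaluates the metric coefficients (9.8) at an arbitrary point; geometric units are assumed, and the function names and sample parameters (M = 1, a = 0.56, Q = 0.8, as in the lower panel of Figure 9.1) are illustrative choices.

```python
import math

def kn_metric_t_phi(r, theta, M, a, Q):
    """Return (g_tt, g_tphi, g_phiphi, Delta) from equations (9.2), (9.3), (9.8)."""
    R2 = r * r + a * a                          # R^2
    rho2 = r * r + (a * math.cos(theta)) ** 2   # rho^2
    Delta = R2 - 2.0 * M * r + Q * Q            # horizon function
    s2 = math.sin(theta) ** 2
    g_tt = -(Delta - a * a * s2) / rho2
    g_tphi = -a * s2 * (R2 - Delta) / rho2
    g_phiphi = s2 * (R2 * R2 - a * a * s2 * Delta) / rho2
    return g_tt, g_tphi, g_phiphi, Delta

def kn_horizons(M, a, Q):
    """Inner and outer horizons (r_minus, r_plus) of equation (9.13), or None."""
    disc = M * M - Q * Q - a * a
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return M - root, M + root

g_tt, g_tphi, g_phiphi, Delta = kn_metric_t_phi(2.3, 1.1, 1.0, 0.56, 0.8)
print(g_tt * g_phiphi - g_tphi ** 2, -math.sin(1.1) ** 2 * Delta)  # the two agree
print(kn_horizons(1.0, 0.56, 0.8))
```

The determinant changes sign exactly where Δ does, which is why the horizons sit at the roots of Δ rather than at the roots of $g_{tt}$.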

Figure 9.1 Geometry of (upper) a Kerr black hole with spin parameter a = 0.96M, and (lower) a Kerr-Newman black hole with charge Q = 0.8M and spin parameter a = 0.56M. The upper half of each diagram shows r ≥ 0, while the lower half shows r ≤ 0, the Antiverse. The outer and inner horizons are confocal oblate spheroids whose focus is the ring singularity. For the Kerr geometry, the turnaround radius is at r = 0. CTCs are closed timelike curves. Features labelled in the diagrams: rotation axis, outer horizon, inner horizon, ergosphere, turnaround radius, ring singularity, the disk r = 0, and CTCs.

Figure 9.2 Not a mouse’s eye view of a snake coming down its mousehole, uhoh. Contours of constant ρ, and their normals, in Boyer-Lindquist coordinates, in a Kerr black hole of spin parameter a = 0.96M. The thicker contours are the outer and inner horizons, which are confocal spheroids with the ring singularity at their focus.

9.6 Angular velocity of the horizon

The Boyer-Lindquist metric (9.1) has been cunningly written so that you can read off the angular velocity of the horizon as observed by observers at rest at infinity. The horizon is at $dr = d\theta = 0$ and Δ = 0, and then the null condition $ds^2 = 0$ implies that the angular velocity is

$$ \frac{d\phi}{dt} = \frac{a}{R^2} . \tag{9.14} $$

The derivative is with respect to the proper time t of observers at rest at infinity, so this is the angular velocity observed by such observers.
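As a small worked example, the angular velocity $a/R^2$ can be evaluated at the outer horizon, where $R^2 = r_+^2 + a^2$. This is a sketch in geometric units; the function name is an illustrative assumption.

```python
import math

def horizon_angular_velocity(M, a, Q=0.0):
    """Angular velocity a / (r_plus^2 + a^2) of the outer horizon.

    Uses r_plus = M + sqrt(M^2 - Q^2 - a^2); raises ValueError if no horizon exists.
    """
    disc = M * M - Q * Q - a * a
    if disc < 0:
        raise ValueError("no horizon: Q^2 + a^2 exceeds M^2")
    r_plus = M + math.sqrt(disc)
    return a / (r_plus * r_plus + a * a)

print(horizon_angular_velocity(1.0, 0.96))  # Kerr, a = 0.96 M: close to 0.375 / M
print(horizon_angular_velocity(1.0, 1.0))   # extremal Kerr: 0.5 / M
```

For the Kerr hole of Figure 9.1, a = 0.96M gives $r_+ = 1.28M$, so the horizon turns at about 0.375/M as seen from infinity.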

9.7 Ergospheres

There are finite regions, just outside the outer horizon and just inside the inner horizon, within which the worldline of an object at rest, $dr = d\theta = d\phi = 0$, is spacelike. These regions, called ergospheres, are places where nothing can remain at rest (the place where little children come from). Objects can escape from within the outer ergosphere (whereas they cannot escape from within the outer horizon), but they cannot remain at rest there. A distant observer will see any object within the outer ergosphere being dragged around by the rotation of the black hole. The direction of dragging is the same as the rotation direction of the black hole in both outer and inner ergospheres. The boundary of the ergosphere is at

$$ g_{tt} = 0 \tag{9.15} $$

which occurs where

$$ \Delta = a^2 \sin^2\theta . \tag{9.16} $$

Equation (9.16) has two solutions, the outer and inner ergospheres. The outer and inner ergospheres touch respectively the outer and inner horizons at the poles, θ = 0 and π.
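Equation (9.16) is a quadratic in r, since $\Delta - a^2\sin^2\theta = r^2 - 2Mr + Q^2 + a^2\cos^2\theta$. The sketch below solves it at a given polar angle, assuming geometric units; the function name is illustrative.

```python
import math

def ergosphere_radii(theta, M, a, Q=0.0):
    """Inner and outer ergosphere boundaries at polar angle theta.

    Solves Delta = a^2 sin^2(theta), i.e. r^2 - 2 M r + Q^2 + a^2 cos^2(theta) = 0.
    Returns None if there are no real roots at this angle.
    """
    disc = M * M - Q * Q - (a * math.cos(theta)) ** 2
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return M - root, M + root

# Kerr black hole with a = 0.96 M (upper panel of Figure 9.1).
print(ergosphere_radii(0.0, 1.0, 0.96))           # poles: coincides with the horizons
print(ergosphere_radii(math.pi / 2, 1.0, 0.96))   # equator: close to (0, 2M)
```

At the poles the roots reduce to the horizon radii of equation (9.13), which is the statement that the ergospheres touch the horizons there.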

9.8 Antiverse

The surface at zero radius, r = 0, forms a disk bounded by the ring singularity. Objects can pass through this disk into the region at negative radius, r < 0, the Antiverse. The Boyer-Lindquist metric (9.1) is unchanged by a symmetry transformation that simultaneously flips the sign both of the radius and mass, r → −r and M → −M. Thus the Boyer-Lindquist geometry at negative r with positive mass is equivalent to the geometry at positive r with negative mass. In effect, the Boyer-Lindquist metric with negative r describes a rotating black hole of negative mass M 0.


4. Reissner-Nordström case: show that if $Q^2 > 0$ and a = 0, then a particle can reach the singularity only if it has zero angular momentum, $\mathcal{Q} = L_z = 0$ (with $\mathcal{Q}$ the Carter integral, as distinct from the black hole charge Q), and if the particle’s charge-to-mass exceeds unity,

$$ \frac{q^2}{m^2} \ge 1 . \tag{9.47} $$

5. Kerr case: show that if Q = 0 but $a^2 > 0$, then a particle can reach the singularity only if the Carter integral vanishes, $\mathcal{Q} = 0$, and provided that the mass of the black hole is positive, M > 0.

6. Kerr-Newman case: show that if $Q^2 > 0$ and $a^2 > 0$, then a particle can reach the singularity only if $L_z = aE$ and $\mathcal{Q} = 0$, and if the particle’s charge-to-mass is large enough,

$$ \frac{q^2}{m^2} \ge \frac{Q^2 + a^2}{Q^2} , \tag{9.48} $$

which generalizes the Reissner-Nordström condition (9.47).

9.16 Penrose process

Trajectories in the Kerr-Newman geometry can have negative energy E outside the horizon. It is possible to reduce the mass M of the black hole by dropping negative energy particles into the black hole. This process of extracting mass-energy from the black hole is called the Penrose process.

Exercise 9.2 Negative energy trajectories outside the horizon. Under what conditions can test particles have negative energy trajectories, E < 0, outside the horizon?

1. Argue that outside the horizon, the positivity of the horizon function Δ, and of the radial and angular potentials R and Θ, equations (9.41), implies that P, equation (9.38), satisfies

$$ P^2 \ge \left( K + m^2 r^2 \right) \Delta \ge \left[ \left( aE \sin\theta - \frac{L_z}{\sin\theta} \right)^2 + m^2 \rho^2 \right] \Delta . \tag{9.49} $$

2. Argue that the condition (9.49) implies by continuity that for a massive particle P must be strictly positive outside the horizon. Extend your argument to a massless particle by taking a massless particle as a massive particle in the limit of large energy.

3. Argue that the positivity of P implies that $aL_z + qQr$ must be negative for the energy E to be negative. Show that, more stringently, negative E requires that

$$ aL_z + qQr \le - \sqrt{ \left( \frac{L_z^2}{\sin^2\theta} + m^2 \rho^2 \right) \Delta } . \tag{9.50} $$

4. Argue that for an uncharged particle, q = 0, negative energy trajectories exist only inside the ergosphere.

5. Do negative energy trajectories exist outside the ergosphere for a charged particle?

6. For the Penrose process to work, the negative energy particle must fall through the horizon, where Δ = 0. Does this happen?

Exercise 9.3 When can objects go forwards or backwards in time t?

9.17 Constant latitude trajectories in the Kerr-Newman geometry

A trajectory is at constant latitude if it is at constant polar angle θ,

$$ \theta = \text{constant} . \tag{9.51} $$

Constant latitude orbits occur where the angular potential Θ, equation (9.43b), not only vanishes, but is an extremum,

$$ \Theta = \frac{d\Theta}{d\theta} = 0 , \tag{9.52} $$

the derivative being taken with the constants of motion E, $L_z$, and $\mathcal{Q}$ of the orbit being held fixed. The condition Θ = 0 simply sets the value of the Carter integral $\mathcal{Q}$. Solving dΘ/dθ = 0 yields the condition between energy E and angular momentum $L_z$

$$ E = \pm \sqrt{ 1 + \frac{L_z^2}{a^2 \sin^4\theta} } . \tag{9.53} $$

Solutions at any polar angle θ and any angular momentum $L_z$ exist, ranging from E = ±1 at $L_z = 0$, to $E = \pm L_z/(a \sin^2\theta)$ at $L_z \to \pm\infty$. The solutions with $L_z = 0$ are those of the freely-falling observers that define the Doran coordinate system, §9.13. The solutions with $L_z \to \infty$ define the principal null congruences discussed in §9.18.
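The two limits quoted for equation (9.53) can be confirmed numerically. The sketch below takes the positive-energy branch with unit rest mass; the function name and sample values are illustrative assumptions.

```python
import math

def constant_latitude_energy(L_z, a, theta):
    """Positive branch of equation (9.53): E = sqrt(1 + L_z^2 / (a^2 sin^4 theta))."""
    return math.sqrt(1.0 + (L_z / (a * math.sin(theta) ** 2)) ** 2)

a, theta = 0.5, math.pi / 3
for L_z in (0.0, 1.0, 1.0e3, 1.0e6):
    E = constant_latitude_energy(L_z, a, theta)
    print(f"L_z = {L_z:9.1f}   E = {E:.6e}")

# For large L_z the ratio L_z / E tends to a sin^2(theta).
print(1.0e6 / constant_latitude_energy(1.0e6, a, theta), a * math.sin(theta) ** 2)
```

The limiting ratio $L_z/E \to a\sin^2\theta$ is the angular momentum per unit energy of the principal null congruences.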

9.18 Principal null congruence

A congruence is a space-filling, non-overlapping set of geodesics. In the Kerr-Newman geometry there is a special set of null geodesics, the ingoing and outgoing principal null congruences, with respect to which the symmetries of the geometry are especially apparent. Photons that hold steady on the horizon are members of the outgoing principal null congruence. The energy-momentum tensor is diagonal in a locally inertial frame aligned with the ingoing or outgoing principal null congruence. The Weyl tensor, decomposed into spin components in the locally inertial frame of the principal null congruences, contains only spin-0 components. The Boyer-Lindquist metric (9.1) is specifically constructed so that the Boyer-Lindquist tetrad is aligned with the principal null tetrad. Along the principal null congruences, the final two terms of the Boyer-Lindquist line element (9.1) vanish

$$ d\theta = d\phi - \frac{a}{R^2} \, dt = 0 . \tag{9.54} $$

Solving the null condition $ds^2 = 0$ on the rest of the metric yields the photon 4-velocity $v^\mu \equiv dx^\mu/d\lambda$ on the principal null congruences

$$ v^t = \frac{R^2}{\Delta} , \quad v^r = \pm 1 , \quad v^\theta = 0 , \quad v^\phi = \frac{a}{\Delta} . \tag{9.55} $$

In the regions outside the outer horizon or inside the inner horizon, the ± sign in front of $v^r$ is + for outgoing, − for ingoing geodesics. Between the outer and inner horizons, $v^r$ is negative in the Black Hole region, and positive in the White Hole region, while $v^t$ and $v^\phi$ are negative for ingoing, positive for outgoing geodesics. The angular momentum per unit energy $J_z \equiv L_z/|E|$ of photons along the principal null congruences is not zero, but is

$$ J_z = a \sin^2\theta \tag{9.56} $$

with the same sign for both ingoing and outgoing geodesics.

9.19 Circular orbits in the Kerr-Newman geometry

An orbit can be termed circular if it is at constant radius r,

$$ r = \text{constant} . \tag{9.57} $$

It is convenient to call such an orbit circular even if the orbit is at finite inclination (not confined to the equatorial plane) about a rotating black hole, and therefore follows the surface of a spheroid (in Boyer-Lindquist coordinates). Orbits turn around in r, reaching periapsis or apoapsis, where the radial potential R, equation (9.43a), vanishes. Circular orbits occur where the radial potential R not only vanishes, but is an extremum,

$$ R = \frac{dR}{dr} = 0 , \tag{9.58} $$

the derivative being taken with the constants of motion E, $L_z$, and $\mathcal{Q}$ of the orbit being held fixed. Circular orbits may be either stable or unstable. The stability of a circular orbit is determined by the sign of the second derivative of the potential

$$ \frac{d^2 R}{dr^2} , \tag{9.59} $$

with − for stable, + for unstable circular orbits: the radial potential must be non-negative along an orbit, so a stable circular orbit sits where R reaches a local maximum of zero. Marginally stable orbits occur where $d^2 R/dr^2 = 0$. Circular orbits occur not only in the equatorial plane, but at general inclinations. The inclination of an orbit can be characterized by the minimum polar angle $\theta_{\min}$ to which it extends. An astronomer would call $\pi/2 - \theta_{\min}$ the inclination angle of the orbit. It is convenient to define an inclination parameter α by

$$ \alpha \equiv \cos^2\theta_{\min} , \tag{9.60} $$

which lies in the interval [0, 1]. Equatorial orbits, at θ = π/2, correspond to α = 0, while polar orbits, those that go over the poles at θ = 0 and π, correspond to α = 1.

9.19.1 General solution for circular orbits

The general solution for circular orbits of a test particle of arbitrary electric charge q in the Kerr-Newman geometry is as follows. The rest mass m of the test particle can be set equal to unity, m = 1, without loss of generality. Circular orbits of particles with zero rest mass, m = 0, discussed later in this section, occur in cases where the circular orbits for massive particles attain infinite energy and angular momentum. In the radial potential R, equation (9.43a), eliminate the Carter integral $\mathcal{Q}$ in favour of the inclination parameter α, equation (9.60), using equation (9.43b)

$$ \mathcal{Q} = \alpha \left[ a^2 (1 - E^2) + \frac{L_z^2}{1 - \alpha} \right] . \tag{9.61} $$

Furthermore, eliminate the energy E in favour of P, equation (9.38). The radial derivatives $d^n R/dr^n$ must be taken before E is replaced by P, since E is a constant of motion, whereas P varies with r. The physical motivation for replacing E with P lies in the sign of P. Solutions with positive P correspond to orbits in the Universe, Wormhole, or Antiverse parts of the Kerr-Newman geometry in the Penrose diagram of Figure 9.3, while solutions with negative P correspond to orbits in their Parallel counterparts. If only the Universe region is considered, then P is necessarily positive. By contrast, the energy E can be either positive or negative in the same region of the Kerr-Newman geometry (the energy E is negative for orbits of sufficiently large negative angular momentum $L_z$ inside the ergosphere of the Universe). The condition R = 0 is a quadratic equation in $L_z$, whose solutions are

$$ L_z = \frac{1}{r^2 + a^2\alpha} \left[ a (1 - \alpha)(P + qQr) \pm R^2 \sqrt{ (1 - \alpha) \left[ P^2/\Delta - (r^2 + a^2\alpha) \right] } \right] , \tag{9.62} $$

where $R^2 = r^2 + a^2$ as in equation (9.2). Substituting the two (±) expressions (9.62) for $L_z$ into dR/dr, and setting the product of the resulting two expressions for dR/dr equal to zero, yields a quartic equation for P/Δ:

$$ p_0 + p_1 (P/\Delta) + p_2 (P/\Delta)^2 + p_3 (P/\Delta)^3 + p_4 (P/\Delta)^4 = 0 , \tag{9.63} $$

with coefficients

$$
\begin{aligned}
p_0 &\equiv r^2 (r^2 + a^2\alpha)^2 , &\text{(9.64a)} \\
p_1 &\equiv - 2 qQr (r^2 - a^2\alpha)(r^2 + a^2\alpha) , &\text{(9.64b)} \\
p_2 &\equiv - 2 r^2 (r^2 + a^2\alpha)(r^2 - 3Mr + 2Q^2 + a^2\alpha + a^2\alpha M/r) + q^2 Q^2 (r^2 - a^2\alpha)^2 , &\text{(9.64c)} \\
p_3 &\equiv 2 qQr (r^2 - a^2\alpha)(r^2 - 3Mr + 2Q^2 + 2a^2 - a^2\alpha + a^2\alpha M/r) , &\text{(9.64d)} \\
p_4 &\equiv r^6 - 6Mr^5 + (9M^2 + 4Q^2 + 2a^2\alpha) r^4 - 4M (3Q^2 + a^2) r^3 \\
&\quad + (4Q^4 - 6a^2\alpha M^2 + 4a^2 Q^2 + a^4\alpha^2) r^2 + 2a^2\alpha (2Q^2 + 2a^2 - a^2\alpha) M r + a^4\alpha^2 M^2 . &\text{(9.64e)}
\end{aligned}
$$


The quartic (9.63) is the condition for an orbit at radius r to be circular. Physical solutions must be real. The quartic (9.63) has either zero, two, or four real solutions at any one radius r. Numerically, it is better to solve the quartic (9.63) for the reciprocal ∆/P rather than P/∆, since the vanishing of 1/P defines the location of circular orbits of massless particles.
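The quartic can be sanity-checked in limits where the answer is known. The sketch below codes the coefficients (9.64) as reconstructed here (in particular, the $r^2$ term of $p_4$ is taken to contain $a^2\alpha M^2$, which is what dimensional consistency and the factorization (9.68) require) and verifies that in the Schwarzschild limit the root is $(P/\Delta)^2 = r/(r - 3M)$. The function names are illustrative assumptions.

```python
import math

def quartic_coefficients(r, M, a, Q, q, alpha):
    """Coefficients (p0, p1, p2, p3, p4) of the quartic (9.63), unit rest mass."""
    a2al = a * a * alpha
    s = r * r + a2al                       # r^2 + a^2 alpha
    d = r * r - a2al                       # r^2 - a^2 alpha
    A = r * r - 3.0 * M * r + 2.0 * Q * Q + a2al + a2al * M / r
    p0 = r * r * s * s
    p1 = -2.0 * q * Q * r * d * s
    p2 = -2.0 * r * r * s * A + (q * Q * d) ** 2
    p3 = 2.0 * q * Q * r * d * (A + 2.0 * a * a - 2.0 * a2al)
    p4 = (r ** 6 - 6.0 * M * r ** 5
          + (9.0 * M * M + 4.0 * Q * Q + 2.0 * a2al) * r ** 4
          - 4.0 * M * (3.0 * Q * Q + a * a) * r ** 3
          + (4.0 * Q ** 4 - 6.0 * a2al * M * M + 4.0 * a * a * Q * Q + a2al ** 2) * r ** 2
          + 2.0 * a2al * (2.0 * Q * Q + 2.0 * a * a - a2al) * M * r
          + a2al ** 2 * M * M)
    return p0, p1, p2, p3, p4

def quartic_value(x, coeffs):
    """Evaluate p0 + p1 x + p2 x^2 + p3 x^3 + p4 x^4 by Horner's rule."""
    p0, p1, p2, p3, p4 = coeffs
    return p0 + x * (p1 + x * (p2 + x * (p3 + x * p4)))

# Schwarzschild limit: a = Q = q = 0 at r = 10 M.
coeffs = quartic_coefficients(10.0, 1.0, 0.0, 0.0, 0.0, 0.0)
x = math.sqrt(10.0 / 7.0)                  # P / Delta = sqrt(r / (r - 3M))
print(quartic_value(x, coeffs) / coeffs[0])  # relative residual, close to zero
```

In practice one would pass the coefficients, in reversed order, to a polynomial root finder to obtain ∆/P, following the advice above to solve for the reciprocal.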

Figure 9.6 Values of ∆/P for circular orbits at radius r of a charged particle about a Kerr-Newman black hole. The values ∆/P are real roots of the quartic (9.63); there are either zero, two, or four real roots at any one radius. The parameters are representative: a particle of charge-to-mass q/m = 2.4 on an orbit of inclination parameter α = 0.5 about a black hole of charge Q = 0.5M and spin parameter a = 0.5M. Solid (green) lines indicate stable orbits; dashed (brown) lines indicate unstable orbits. Positive ∆/P orbits occur in Universe, Wormhole, and Antiverse regions; negative ∆/P orbits occur in their Parallel counterparts; zero ∆/P orbits are null. The fact that the particle is charged breaks the symmetry between positive and negative ∆/P. If the charge of the particle were flipped, q/m = −2.4, then the diagram would be reflected about the horizontal axis (the sign of ∆/P would flip). Axes: ∆/P (vertical) against radius r/M (horizontal); the outer and inner horizons are marked.

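The angular momentum $L_z$, energy E, and stability $d^2 R/dr^2$ of a circular orbit are, in terms of a solution P/Δ of the quartic (9.63),

$$
\begin{aligned}
L_z &= \pm \frac{1}{r^2 + a^2\alpha} \sqrt{ (1 - \alpha) \left[ l_{-1} (\Delta/P) + l_0 + l_1 (P/\Delta) + l_2 (P/\Delta)^2 \right] } , &\text{(9.65a)} \\
E &= \frac{1}{2} \left[ (\Delta/P) + qQ/r + (1 - M/r)(P/\Delta) \right] , &\text{(9.65b)} \\
\frac{d^2 R}{dr^2} &= \frac{2}{(r^2 + a^2\alpha)^2} \left[ q_{-1} (\Delta/P) + q_0 + q_1 (P/\Delta) + q_2 (P/\Delta)^2 \right] , &\text{(9.65c)}
\end{aligned}
$$

where the coefficients $l_i$ and $q_i$ are

$$
\begin{aligned}
l_{-1} &= qQr R^2 (r^2 + a^2\alpha) , &\text{(9.66a)} \\
l_0 &= - R^2 (r^2 + a^2\alpha)(2Mr - Q^2) - q^2 Q^2 (r^4 - a^4\alpha) , &\text{(9.66b)} \\
l_1 &= - qQr \left[ 2r^4 - 5Mr^3 + 3(Q^2 + a^2) r^2 - a^2 (1 + \alpha) M r + a^2 (Q^2 + \alpha Q^2 + a^2 - a^2\alpha) \right. \\
&\qquad \left. {} + 3a^4\alpha M/r - a^4\alpha (Q^2 + a^2)/r^2 \right] , &\text{(9.66c)} \\
l_2 &= \left[ 3Mr^3 - 2Q^2 r^2 + a^2 (1 + \alpha) M r - a^2 (1 + \alpha) Q^2 - a^4\alpha M/r \right] \Delta , &\text{(9.66d)}
\end{aligned}
$$

and

$$
\begin{aligned}
q_{-1} &= 2 qQr (r^2 - a^2\alpha)(r^2 + a^2\alpha) , &\text{(9.67a)} \\
q_0 &= - 4 (r^2 + a^2\alpha)(Mr^3 - Q^2 r^2 - a^2\alpha M r) - q^2 Q^2 (r^2 - a^2\alpha)^2 , &\text{(9.67b)} \\
q_1 &= - qQr \left[ r^4 - 4Mr^3 + 3(Q^2 + a^2 - 2a^2\alpha) r^2 + 12 a^2\alpha M r - a^2\alpha (6Q^2 + 6a^2 - a^2\alpha) \right. \\
&\qquad \left. {} - a^4\alpha^2 (Q^2 + a^2)/r^2 \right] , &\text{(9.67c)} \\
q_2 &= \left[ 3Mr^3 - 4Q^2 r^2 - 6a^2\alpha M r - a^4\alpha^2 M/r \right] \Delta . &\text{(9.67d)}
\end{aligned}
$$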

The sign of the angular momentum $L_z$ in equation (9.65a) should be chosen such that the relations (9.38) for P and (9.65b) for E hold. This choice of sign becomes ambiguous for a = 0; but this is as it should be, since either sign of $L_z$ is valid for a = 0, where the black hole is spherically symmetric, and therefore defines no preferred direction. The expressions (9.62) and (9.65a) for $L_z$ are equal on a circular orbit. The advantage of the latter expression (9.65a) will become apparent below, where it is found that for particles of zero electric charge, q = 0, one circular orbit is always prograde, $aL_z > 0$, while the other is always retrograde, $aL_z < 0$. For non-zero a, the reality of a solution P/Δ of the quartic (9.63) is a necessary and sufficient condition for a corresponding circular orbit to exist. In particular, the argument of the square root in the expression (9.65a) for $L_z$ is guaranteed to be positive. For zero a, however, the quartic (9.63), which reduces in this case to the square of a quadratic, admits real solutions that do not correspond to a circular orbit. For these invalid solutions, the argument of the square root in the expression (9.65a) for $L_z$ is negative. Thus for zero a, a necessary and sufficient condition for a circular orbit to exist is that the solutions for both P/Δ and $L_z$ be real.

9.19.2 Circular orbits for massless particles

Circular orbits for massless particles, $m = 0$, or null circular orbits, follow from the solutions for massive particles in the case where the energy and angular momentum on the circular orbit become infinite, which occurs when $P \to \pm\infty$. Except at horizons, where $\Delta = 0$, the solution for $P$ from the quartic (9.63) diverges when the ratio $p_4/p_0$ of the highest to lowest order coefficients vanishes. The ratio $p_4/p_0$, equations (9.64), factors as
\[
\frac{p_4}{p_0} = \frac{F_+ F_-}{(r^2 + a^2\alpha)^2} , \tag{9.68}
\]
where
\[
F_\pm \equiv r^2 - 3Mr + 2Q^2 + a^2\alpha(1 + M/r) \pm 2a \sqrt{(1 - \alpha)(Mr - Q^2 - a^2\alpha M/r)} . \tag{9.69}
\]
A null circular orbit thus occurs at a radius $r$ such that
\[
F_+ = 0 \quad\text{or}\quad F_- = 0 , \tag{9.70}
\]
with $+$ for prograde ($aL_z > 0$) orbits, $-$ for retrograde ($aL_z < 0$) orbits. The locations of null circular orbits are independent of the charge $q$ of the particle, since $F_\pm$ are independent of $q$. The angular momentum $J_z$ per unit energy on the null circular orbit is, from equations (9.65a) and (9.65b) in the limit $P \to \pm\infty$,
\[
J_z \equiv L_z/|E| = \pm \frac{2 \sqrt{(1 - \alpha) l_2}}{(r^2 + a^2\alpha)(1 - M/r)} . \tag{9.71}
\]

The case where $F_+$ or $F_-$ vanishes at a horizon is special. This occurs when the black hole is extremal, $M^2 = Q^2 + a^2$. A circular orbit exists at the horizon of an extremal black hole provided that the charge squared $Q^2$ and inclination parameter $\alpha$ are not too large, the precise condition being
\[
a^4\alpha^2 + 6(Q^2+a^2)\alpha - (Q^2+a^2)(Q^2-3a^2) \le 0 . \tag{9.72}
\]

The circular orbit is non-null, since the vanishing of $\Delta/P$ no longer implies that $P$ diverges if $\Delta = 0$, as is true on the horizon. A careful analysis shows that the limiting value of $P/\sqrt{\Delta}$ is finite for a circular orbit at the horizon of an extremal black hole, so in fact $P = 0$ for such an orbit.

Since there are null geodesics, the ingoing or outgoing principal null geodesics, that hold steady on the horizon, one might have expected that there would always be solutions for null circular orbits on the horizon, but this is false. The resolution of the paradox is that massless particles experience no proper time along their geodesics. If a massive particle is put on the horizon on a relativistic geodesic, then the massive particle necessarily falls off the horizon in a finite proper time: it is impossible for the geodesic to hold steady on the horizon. The only exception is that, as discussed in the previous paragraph, an extremal black hole may have circular orbits at its horizon; but these orbits have $P = 0$, and are not null.

9.19.3 Circular orbits for particles with zero electric charge

For a particle with zero electric charge, $q = 0$, the quartic condition (9.63) for a circular orbit reduces to a quadratic in $(P/\Delta)^2$. Solving the quadratic for the reciprocal $(\Delta/P)^2$ yields two possible solutions
\[
(\Delta/P)^2 = \frac{F_\pm}{r^2 + a^2\alpha} , \tag{9.73}
\]
where $F_\pm$ are defined by equation (9.69), with $+$ for prograde ($aL_z > 0$) orbits, $-$ for retrograde ($aL_z < 0$) orbits. The sign of $P$ is positive in the Universe, Wormhole, and Antiverse of Figure 9.3, negative in their Parallel counterparts. For zero electric charge, the expressions (9.65) for the angular momentum $L_z$, energy $E$, and stability $d^2R/dr^2$ of a circular orbit simplify to
\begin{align}
L_z &= \pm \frac{1}{r^2 + a^2\alpha} \sqrt{(1 - \alpha) \bigl[ l_0 + l_2 (P/\Delta)^2 \bigr]} , \tag{9.74a} \\
E &= \frac{1}{2} \bigl[ (\Delta/P) + (1 - M/r)(P/\Delta) \bigr] , \tag{9.74b} \\
\frac{d^2R}{dr^2} &= \frac{2}{(r^2 + a^2\alpha)^2} \bigl[ q_0 + q_2 (P/\Delta)^2 \bigr] . \tag{9.74c}
\end{align}
The coefficients $l_i$ and $q_i$ in equations (9.74) reduce from the expressions (9.66) and (9.67) to
\begin{align}
l_0 &= - R^2 (r^2 + a^2\alpha)(2Mr - Q^2) , \tag{9.75a} \\
l_2 &= \bigl[ 3Mr^3 - 2Q^2r^2 + a^2(1+\alpha)Mr - a^2(1+\alpha)Q^2 - a^4\alpha M/r \bigr] \Delta , \tag{9.75b}
\end{align}
and
\begin{align}
q_0 &= - 4(r^2 + a^2\alpha)(Mr^3 - Q^2r^2 - a^2\alpha Mr) , \tag{9.76a} \\
q_2 &= \bigl[ 3Mr^3 - 4Q^2r^2 - 6a^2\alpha Mr - a^4\alpha^2 M/r \bigr] \Delta . \tag{9.76b}
\end{align}

[Figure 9.7: plot of P (vertical axis, −2.0 to 2.0) against radius r/M (horizontal axis, 0 to 2.0), with the inner horizon and outer horizon marked.]

Figure 9.7 Values of P for circular orbits at radius r in the equatorial plane of a near-extremal Kerr black hole, with black hole spin parameter a = 0.999M. The diagram illustrates that as the orbital radius r approaches the horizon, P first approaches zero, but then increases sharply to infinity, corresponding to null circular orbits. In the case of an exactly extremal black hole, P goes to zero at the horizon, there is no increase of P to infinity, and no null circular orbit. Solid (green) lines indicate stable orbits; dashed (brown) lines indicate unstable orbits.

9.19.4 Equatorial circular orbits in the Kerr geometry

The case of greatest practical interest to astrophysicists is that of circular orbits in the equatorial plane of an uncharged black hole, the Kerr geometry.

For circular orbits in the equatorial plane, $\alpha = 0$, of an uncharged black hole, $Q = 0$, the solution (9.73) simplifies to
\[
(\Delta/P)^2 = \frac{F_\pm}{r^2} , \tag{9.77}
\]
where $F_\pm$, equation (9.69), reduce to
\[
F_\pm \equiv r^2 - 3Mr \pm 2a \sqrt{Mr} , \tag{9.78}
\]
with $+$ for prograde ($aL_z > 0$) orbits, $-$ for retrograde ($aL_z < 0$) orbits. As discussed above, null circular orbits occur where $F_\pm = 0$, except in the special case that the circular orbit is at the horizon, which occurs when the black hole is extremal. In the limit where the Kerr black hole is near but not exactly extremal, $a \to |M|$, null circular orbits occur at $r \to M$ (prograde) and $r \to 4M$ (retrograde). For an exactly extremal Kerr black hole, $a = |M|$, the (prograde) circular orbit at the horizon is no longer null. The situation of a near extremal Kerr black hole is illustrated by Figure 9.7.

It is generally argued that the inner edge of an accretion disk is likely to occur at the innermost stable equatorial circular orbit. An orbit at this point has marginal stability, $d^2R/dr^2 = 0$. Simplifying the stability $d^2R/dr^2$ from equation (9.74c) to the case of equatorial orbits, $\alpha = 0$, and zero black hole charge, $Q = 0$, yields the condition of marginal stability
\[
r^2 - 6Mr - 3a^2 \pm 8a \sqrt{Mr} = 0 . \tag{9.79}
\]
The $+$ (prograde) orbit has the smaller radius, and so defines the innermost stable circular orbit. For an extremal Kerr black hole, $a = |M|$, marginally stable circular equatorial orbits are at $r = M$ (prograde) and $r = 9M$ (retrograde).
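The marginal stability condition (9.79) is easy to solve numerically for intermediate spins. The Python sketch below is illustrative only and is not part of the original text. It assumes units M = 1 and a spin in the range 0 ≤ a ≤ 1, and it uses plain bisection on the brackets [1, 6] (prograde) and [6, 9] (retrograde), which are assumptions suggested by the limiting radii quoted above. The function names are hypothetical.

```python
import math


def marginal_stability(r, a, sign):
    """Left-hand side of equation (9.79) in units M = 1.

    sign = +1 selects the prograde branch, sign = -1 the retrograde branch.
    """
    return r * r - 6.0 * r - 3.0 * a * a + sign * 8.0 * a * math.sqrt(r)


def isco_radius(a, sign=+1, iterations=200):
    """Radius r/M of the marginally stable equatorial circular orbit.

    Assumes 0 <= a <= 1 (spin in units of M). The root is bracketed by
    [1, 6] for prograde orbits and by [6, 9] for retrograde orbits.
    """
    lo, hi = (1.0, 6.0) if sign > 0 else (6.0, 9.0)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # Keep the half-interval across which the function changes sign.
        if marginal_stability(lo, a, sign) * marginal_stability(mid, a, sign) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for spin in (0.0, 0.5, 0.9, 1.0):
        print(spin, isco_radius(spin, +1), isco_radius(spin, -1))
```

For a = 0 both branches return r = 6M, the Schwarzschild value, and for a = 1 they return the extremal values r = M and r = 9M quoted in the text.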

9.19.5 Circular orbits in the Reissner-Nordström geometry

Circular orbits of particles in the Reissner-Nordström geometry follow from those in the Kerr-Newman geometry in the limit of a non-rotating black hole, $a = 0$. For a non-rotating black hole, an orbit can be taken without loss of generality to circulate right-handedly in the equatorial plane, $\theta = \pi/2$, so that $\alpha = 0$ and the azimuthal angular momentum $L_z$ equals the positive total angular momentum $L$. For non-equatorial orbits, the relation between azimuthal and total angular momentum is $L_z = \pm \sqrt{1 - \alpha}\, L$.

For a non-rotating black hole, $a = 0$, the quartic condition (9.63) for a circular orbit of a particle of rest mass $m = 1$ and electric charge $q$ reduces to the square of a quadratic,
\[
r^2 - qQr(P/\Delta) - \bigl( r^2 - 3Mr + 2Q^2 \bigr)(P/\Delta)^2 = 0 . \tag{9.80}
\]
Solving the quadratic (9.80) for the reciprocal $\Delta/P$ yields two solutions
\[
\Delta/P = \frac{qQ}{2r} \pm \sqrt{1 - \frac{3M}{r} + \frac{2Q^2}{r^2} + \frac{q^2Q^2}{4r^2}} . \tag{9.81}
\]
The sign of $P$ is positive in the Universe, Wormhole, and Antiverse parts of the Reissner-Nordström geometry in the Penrose diagram of Figure 8.1, negative in their Parallel counterparts. The angular momentum $L$, energy $E$, and stability $d^2R/dr^2$ of a circular orbit are, in terms of a solution $\Delta/P$ of the quadratic (9.80),
\begin{align}
L &= \sqrt{P^2/\Delta - r^2} , \tag{9.82} \\
E &= \frac{P}{r^2} + \frac{qQ}{r} , \tag{9.83} \\
\frac{d^2R}{dr^2} &= 2 \bigl( r^2 - 6Mr + 5Q^2 + q^2Q^2 \bigr) - 2 \left( 1 - \frac{6M}{r} + \frac{6Q^2}{r^2} \right) \frac{P^2}{\Delta} . \tag{9.84}
\end{align}

For massless particles, circular orbits occur where the solution (9.81) for $\Delta/P$ vanishes, which occurs when
\[
r^2 - 3Mr + 2Q^2 = 0 , \tag{9.85}
\]
independent of the charge $q$ of the particle. The condition (9.85) is consistent with the Kerr-Newman condition for a null circular orbit, the vanishing of $F_\pm$ given by equation (9.69). However, for Kerr-Newman, the argument of the square root on the right hand side of equation (9.69) for $F_\pm$ must be positive, even in the limit of infinitesimal $a$. In the limit of small $a$, this requires that $Mr - Q^2 \ge 0$. If the charge $Q$ of the Reissner-Nordström black hole lies in the standard range $0 \le Q^2 \le M^2$, then one of the solutions of the quadratic (9.85) lies outside the outer horizon, while the other lies between the outer and inner horizons. As one might hope, the additional condition $Mr - Q^2 \ge 0$ eliminates the undesirable solution between the horizons, leaving only the solution outside the horizon, which is
\[
r = \frac{3M}{2} \left( 1 + \sqrt{1 - \frac{8Q^2}{9M^2}} \right) \quad\text{for } 0 \le Q^2 \le M^2 . \tag{9.86}
\]
In (unphysical) cases $Q^2 < 0$ or $M^2 < Q^2 \le (9/8)M^2$, both solutions of equation (9.85) are valid.
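As a quick numerical illustration of equation (9.86), the following Python sketch evaluates the null circular orbit radius and checks it against the quadratic (9.85) and the extra condition $Mr - Q^2 \ge 0$. It is a minimal sketch in geometric units; the function name and the choice to pass the charge squared as an argument are assumptions made here for illustration.

```python
import math


def rn_photon_orbit_radius(M, Q2):
    """Radius of the null circular orbit outside the horizon, equation (9.86).

    M is the black hole mass and Q2 its charge squared, assumed to satisfy
    0 <= Q2 <= M**2 (geometric units).
    """
    if not 0.0 <= Q2 <= M * M:
        raise ValueError("expected 0 <= Q^2 <= M^2")
    return 1.5 * M * (1.0 + math.sqrt(1.0 - 8.0 * Q2 / (9.0 * M * M)))


if __name__ == "__main__":
    for Q2 in (0.0, 0.5, 1.0):
        r = rn_photon_orbit_radius(1.0, Q2)
        # Residual of the quadratic (9.85), and the condition M r - Q^2 >= 0.
        print(Q2, r, r * r - 3.0 * r + 2.0 * Q2, r - Q2 >= 0.0)
```

The uncharged case gives the Schwarzschild photon sphere at r = 3M, and the extremal case $Q^2 = M^2$ gives r = 2M.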

PART FOUR HOMOGENEOUS, ISOTROPIC COSMOLOGY

Concept Questions

1. What does it mean that the Universe is expanding?
2. Does the expansion affect the solar system or the Milky Way?
3. How far out do you have to go before the expansion is evident?
4. What is the Universe expanding into?
5. In what sense is the Hubble constant constant?
6. Does our Universe have a center, and if so where is it?
7. What evidence suggests that the Universe at large is homogeneous and isotropic?
8. How can the CMB be construed as evidence for homogeneity and isotropy given that it provides information only about a 2D surface on the sky?
9. What is thermodynamic equilibrium? What evidence suggests that the early Universe was in thermodynamic equilibrium?
10. What are cosmological parameters?
11. What cosmological parameters can or cannot be measured from the power spectrum of fluctuations of the CMB?
12. FRW Universes are characterized as closed, flat, or open. Does flat here mean the same as flat Minkowski space?
13. What is it that astronomers call dark matter?
14. What is the primary evidence for the existence of non-baryonic cold dark matter?
15. How can astronomers detect dark matter in galaxies or clusters of galaxies?
16. How can cosmologists claim that the Universe is dominated by not one but two distinct kinds of mysterious mass-energy, dark matter and dark energy, neither of which has been observed in the laboratory?
17. What key property or properties distinguish dark energy from dark matter?
18. Does the Universe conserve entropy?
19. Does the annihilation of electron-positron pairs into photons generate entropy in the early Universe, as its temperature cools through 1 MeV?
20. How does the wavelength of light change with the expansion of the Universe?
21. How does the temperature of the CMB change with the expansion of the Universe?
22. How does a blackbody (Planck) distribution change with the expansion of the Universe? What about a non-relativistic distribution? What about a semi-relativistic distribution?
23. What is the horizon of our Universe?
24. What happens beyond the horizon of our Universe?
25. What caused the Big Bang?
26. What happened before the Big Bang?
27. What will be the fate of the Universe?

What’s important?

1. The CMB indicates that the early (≈ 400,000 year old) Universe was (a) uniform to a few $\times 10^{-5}$, and (b) in thermodynamic equilibrium. This indicates that the Universe was once very simple. It is this simplicity that makes it possible to model the early Universe with some degree of confidence.
2. The power spectrum of fluctuations of the CMB has enabled precise measurements of cosmological parameters, excepting the Hubble constant.
3. There is a remarkable concordance of evidence from a broad range of astronomical observations: supernovae, big bang nucleosynthesis, the clustering of galaxies, the abundances of clusters of galaxies, measurements of the Hubble constant from Cepheid variables, the ages of the oldest stars.
4. Observational evidence is consistent with the predictions of the theory of inflation in its simplest form: the expansion of the Universe, the spatial flatness of the Universe, the near uniformity of temperature fluctuations of the CMB (the horizon problem), the presence of acoustic peaks and troughs in the power spectrum of fluctuations of the CMB, the near power law shape of the power spectrum at large scales, its spectral index (tilt), the gaussian distribution of fluctuations at large scales.
5. What is non-baryonic dark matter?
6. What is dark energy? What is its equation of state $w \equiv p/\rho$, and how does $w$ evolve with time?

10 Homogeneous, Isotropic Cosmology

10.1 Observational basis

Since 1998, observations have converged on a Standard Model of Cosmology, a spatially flat universe dominated by dark energy and by non-baryonic dark matter.

1. The Hubble diagram (distance versus redshift) of galaxies indicates that the Universe is expanding (Hubble 1929).
2. The Cosmic Microwave Background (CMB).
   • Near black body spectrum, with $T_0 = 2.725 \pm 0.001$ K (Fixsen & Mather 2002).
   • Dipole ⇒ the solar system is moving at 365 km s$^{-1}$ through the CMB.
   • After dipole subtraction, the temperature of the CMB over the sky is uniform to a few parts in $10^5$.
   • The power spectrum of temperature $T$ fluctuations shows a scale-invariant spectrum at large scales, and prominent acoustic peaks at smaller scales. It allows measurement of the amplitude $A_s$ and tilt $n_s$ of primordial fluctuations, the curvature density $\Omega_k$, and the proper densities $\Omega_c h^2$ of non-baryonic cold dark matter and $\Omega_b h^2$ of baryons. It does not measure the Hubble constant $h \equiv H_0 / (100\ \mathrm{km\,s^{-1}\,Mpc^{-1}})$.
   • The power spectra of $E$ and $B$ polarization fluctuations, and the various cross power spectra (only $T$-$E$ should be non-vanishing).
3. The Hubble diagram of Type Ia (thermonuclear) supernovae indicates that the Universe is accelerating. This points to the dominance of gravitationally repulsive dark energy, with $\Omega_\Lambda \approx 0.75$. The amount of dark energy is consistent with observations from the CMB indicating that the Universe is spatially flat, $\Omega \approx 1$, and observations from CMB, galaxy clustering, and clusters of galaxies indicating that the density in gravitationally attractive matter is only $\Omega_m \approx 0.25$.
4. Observed abundances of light elements H, D, $^3$He, He, and Li are consistent with the predictions of big bang nucleosynthesis (BBN) provided that $\Omega_b \approx 0.04$, in good agreement with measurements from the CMB.
5. The clustering of matter (dark and bright) shows a power spectrum in good agreement with the Standard Model:
   • galaxies;
   • the Lyman alpha forest;
   • gravitational lensing.
   Historically, the principal evidence for non-baryonic cold dark matter is comparison between the power spectra of galaxies versus CMB. How can tiny fluctuations in the CMB grow into the observed fluctuations in matter today in only the age of the Universe? Answer: non-baryonic dark matter that begins to cluster before Recombination, when the CMB was released.
6. The abundance of galaxy clusters as a function of redshift.
7. The ages of the oldest stars, in globular clusters. The Hubble constant yields an estimate of the age of the Universe that is older with dark energy than without. The ages of the oldest stars agree with the age of the Universe with dark energy, but are older than the Universe without dark energy.
8. Ubiquitous evidence for dark matter, deduced from sizes and velocities (or in the case of gravitational lensing, the gravitational potential) of various objects.
   • The Local Group of galaxies.
   • Rotation curves of spiral galaxies.
   • The temperature and distribution of x-ray gas in elliptical galaxies.
   • The temperature and distribution of x-ray gas in clusters of galaxies.
   • Gravitational lensing by clusters of galaxies.
9. The Bullet cluster is a rare example that supports the notion that the dark matter is non-baryonic. In the Bullet cluster, two clusters recently passed through each other. The baryonic matter, as measured from x-ray emission of hot gas, appears displaced from the dark matter, as measured from weak gravitational lensing.

10.2 Cosmological Principle

The cosmological principle states that the Universe at large is
• homogeneous (has spatial translation symmetry),
• isotropic (has spatial rotation symmetry).

The primary evidence for this is the uniformity of the temperature of the CMB, which, after subtraction of the dipole produced by the motion of the solar system through the CMB, is constant over the sky to a few parts in $10^5$. Confirming evidence is the statistical uniformity of the distribution of galaxies over large scales.

The cosmological principle allows that the Universe evolves in time, as observations surely indicate: the Universe is expanding; galaxies, quasars, and galaxy clusters evolve with redshift; and the temperature of the CMB is undoubtedly decreasing as the Universe expands.

10.3 Friedmann-Robertson-Walker metric

Universes satisfying the cosmological principle are described by the Friedmann-Robertson-Walker (FRW) metric, equation (10.25) below. The metric, and the associated Einstein equations, which are known as the Friedmann equations, are set out in the next several sections, §§10.4–10.9.

10.4 Spatial part of the FRW metric: informal approach

The cosmological principle implies that the spatial part of the FRW metric is a
\[
\text{3D hypersphere} \tag{10.1}
\]
where in this context the term hypersphere is to be construed as including not only cases of positive curvature, which have finite positive radius of curvature, but also cases of zero and negative curvature, which have infinite and imaginary radius of curvature.

[Figure 10.1: sphere of radius R drawn on axes x, y, w, showing the angle χ measured from the north pole, the azimuthal angle φ, the geodesic distance $r_\parallel = R\chi$, the circumferential radius $r = R \sin\chi$, and the line elements $R\,d\chi$ and $R \sin\chi\, d\phi$.]

Figure 10.1 Embedding diagram of the FRW geometry.

Figure 10.1 shows an embedding diagram of a 3D hypersphere in 4D Euclidean space. The horizontal directions in the diagram represent the normal 3 spatial $x$, $y$, $z$ dimensions, with one dimension $z$ suppressed, while the vertical dimension represents the 4th spatial dimension $w$. The 3D hypersphere is a set of points $\{x, y, z, w\}$ satisfying
\[
\left( x^2 + y^2 + z^2 + w^2 \right)^{1/2} = R = \text{constant} . \tag{10.2}
\]
An observer is sitting at the north pole of the diagram, at $\{0, 0, 0, R\}$. A 2D sphere (which forms a 1D circle in the embedding diagram of Figure 10.1) at fixed distance surrounding the observer has geodesic distance $r_\parallel$ defined by
\[
r_\parallel \equiv \text{proper distance to sphere measured along a radial geodesic} , \tag{10.3}
\]
and circumferential radius $r$ defined by
\[
r \equiv \left( x^2 + y^2 + z^2 \right)^{1/2} , \tag{10.4}
\]
which has the property that the proper circumference of the sphere is $2\pi r$. In terms of $r_\parallel$ and $r$, the spatial metric is
\[
dl^2 = dr_\parallel^2 + r^2 do^2 \tag{10.5}
\]
where $do^2 \equiv d\theta^2 + \sin^2\theta\, d\phi^2$ is the metric of a unit 2-sphere. Introduce the angle $\chi$ illustrated in the diagram. Evidently
\[
r_\parallel = R\chi , \quad r = R \sin\chi . \tag{10.6}
\]
In terms of the angle $\chi$, the spatial metric is
\[
dl^2 = R^2 \left( d\chi^2 + \sin^2\chi\, do^2 \right) \tag{10.7}
\]

which is one version of the spatial FRW metric. The metric resembles the metric of a 2-sphere of radius $R$, which is not surprising since the same construction, with Figure 10.1 interpreted as the embedding diagram of a 2D sphere in 3D, yields the metric of a 2-sphere. Indeed, the construction iterates to give the metric of an $n$-dimensional sphere of arbitrarily many dimensions $n$.

Instead of the angle $\chi$, the metric can be expressed in terms of the circumferential radius $r$. It follows from equations (10.6) that
\[
r_\parallel = R \sin^{-1}(r/R) \tag{10.8}
\]
whence
\[
dr_\parallel = \frac{dr}{\sqrt{1 - r^2/R^2}} = \frac{dr}{\sqrt{1 - Kr^2}} \tag{10.9}
\]
where $K$ is the curvature
\[
K \equiv \frac{1}{R^2} . \tag{10.10}
\]
In terms of $r$, the spatial FRW metric is then
\[
dl^2 = \frac{dr^2}{1 - Kr^2} + r^2 do^2 . \tag{10.11}
\]

The embedding diagram Figure 10.1 is a nice prop for the imagination, but it is not the whole story. The curvature $K$ in the metric (10.11) may be not only positive, corresponding to real finite radius $R$, but also zero or negative, corresponding to infinite or imaginary radius $R$. The possibilities are called closed, flat, and open:
\[
K \left\{ \begin{array}{lll}
> 0 & \text{closed} & R \ \text{real} , \\
= 0 & \text{flat} & R \to \infty , \\
< 0 & \text{open} & R \ \text{imaginary} .
\end{array} \right. \tag{10.12}
\]

10.5 Comoving coordinates

The metric (10.11) is valid at any single instant of cosmic time $t$. As the Universe expands, the 3D spatial hypersphere (whether closed, flat, or open) expands. In cosmology it is highly advantageous to work in comoving coordinates that expand with the Universe. Why? First, it is helpful conceptually and mathematically to think of the Universe as at rest in comoving coordinates. Second, linear perturbations, such as those in the CMB, have wavelengths that expand with the Universe, and are therefore fixed in comoving coordinates.

In practice, cosmologists introduce the cosmic scale factor $a(t)$
\[
a(t) \equiv \text{measure of the size of the Universe, expanding with the Universe} \tag{10.13}
\]
which is proportional to but not necessarily equal to the radius $R$ of the Universe. The cosmic scale factor $a$ can be normalized in any arbitrary way. The most common convention adopted by cosmologists is to normalize it to unity at the present time,
\[
a_0 = 1 , \tag{10.14}
\]
where the 0 subscript conventionally signifies the present time.

Comoving geodesic and circumferential radial distances $x_\parallel$ and $x$ are defined in terms of the proper geodesic and circumferential radial distances $r_\parallel$ and $r$ by
\[
a x_\parallel \equiv r_\parallel , \quad a x \equiv r . \tag{10.15}
\]
Objects expanding with the Universe remain at fixed comoving positions $x_\parallel$ and $x$. In terms of the comoving circumferential radius $x$, the spatial FRW metric is
\[
dl^2 = a^2 \left( \frac{dx^2}{1 - \kappa x^2} + x^2 do^2 \right) , \tag{10.16}
\]
where the curvature constant $\kappa$, a constant in time and space, is related to the curvature $K$, equation (10.10), by
\[
\kappa \equiv a^2 K . \tag{10.17}
\]
Alternatively, in terms of the geodesic comoving radius $x_\parallel$, the spatial FRW metric is
\[
dl^2 = a^2 \left( dx_\parallel^2 + x^2 do^2 \right) , \tag{10.18}
\]
where
\[
x = \left\{ \begin{array}{ll}
\dfrac{\sin(\kappa^{1/2} x_\parallel)}{\kappa^{1/2}} & \kappa > 0 \ \text{closed} , \\[2ex]
x_\parallel & \kappa = 0 \ \text{flat} , \\[1ex]
\dfrac{\sinh(|\kappa|^{1/2} x_\parallel)}{|\kappa|^{1/2}} & \kappa < 0 \ \text{open} .
\end{array} \right. \tag{10.19}
\]
For some purposes it is convenient to normalize the cosmic scale factor $a$ so that $\kappa = 1$, $0$, or $-1$. In this case the spatial FRW metric may be written
\[
dl^2 = a^2 \left( d\chi^2 + x^2 do^2 \right) , \tag{10.20}
\]
where
\[
x = \left\{ \begin{array}{ll}
\sin(\chi) & \kappa = 1 \ \text{closed} , \\
\chi & \kappa = 0 \ \text{flat} , \\
\sinh(\chi) & \kappa = -1 \ \text{open} .
\end{array} \right. \tag{10.21}
\]
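The three cases of equation (10.19) are often needed together in numerical work. The following Python sketch is one possible way to code them; the function name is hypothetical, and the inputs are assumed to be the comoving geodesic distance $x_\parallel$ and the curvature constant $\kappa$ in mutually consistent units.

```python
import math


def comoving_circumferential_radius(x_par, kappa):
    """Comoving circumferential radius x for comoving geodesic distance x_par.

    Implements the three cases of equation (10.19); kappa is the curvature
    constant and may be positive (closed), zero (flat), or negative (open).
    """
    if kappa > 0.0:
        root = math.sqrt(kappa)
        return math.sin(root * x_par) / root
    if kappa < 0.0:
        root = math.sqrt(-kappa)
        return math.sinh(root * x_par) / root
    return x_par


if __name__ == "__main__":
    for kappa in (1.0, 0.0, -1.0):
        print(kappa, comoving_circumferential_radius(0.5, kappa))
```

With $\kappa$ normalized to 1, 0, or −1 the function reduces to equation (10.21), with $x_\parallel$ playing the role of $\chi$.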

Exercise 10.1 By a suitable transformation of the comoving radial coordinate $x$, bring the spatial FRW metric (10.16) to the “isotropic” form
\[
dl^2 = \frac{a^2}{\left( 1 + \frac{1}{4} \kappa X^2 \right)^2} \left( dX^2 + X^2 do^2 \right) . \tag{10.22}
\]
What is the relation between $X$ and $x$?

10.6 Spatial part of the FRW metric: more formal approach

A more formal approach to the derivation of the spatial FRW metric from the cosmological principle starts with the proposition that the spatial components $G_{ij}$ of the Einstein tensor at fixed scale factor $a$ (all time derivatives of $a$ set to zero) should be proportional to the metric tensor
\[
G_{ij} = K g_{ij} \quad (i, j = 1, 2, 3) . \tag{10.23}
\]
Without loss of generality, the spatial metric can be taken to be of the form
\[
dl^2 = f(r)\, dr^2 + r^2 do^2 . \tag{10.24}
\]
Imposing the condition (10.23) on the metric (10.24) recovers the spatial FRW metric (10.11).

10.7 FRW metric

The full Friedmann-Robertson-Walker spacetime metric is
\[
ds^2 = - dt^2 + a(t)^2 \left( \frac{dx^2}{1 - \kappa x^2} + x^2 do^2 \right) \tag{10.25}
\]
where $t$ is cosmic time, which is the proper time experienced by comoving observers, who remain at rest in comoving coordinates $dx = d\theta = d\phi = 0$. Any of the alternative versions of the comoving spatial FRW metric, equations (10.16), (10.18), (10.20), or (10.22), may be used as the spatial part of the FRW spacetime metric (10.25).

10.8 Einstein equations for FRW metric

The Einstein equations for the FRW metric (10.25) are
\begin{align}
- G^t_t &= 3 \left( \frac{\dot a^2}{a^2} + \frac{\kappa}{a^2} \right) = 8\pi G \rho , \notag \\
G^x_x = G^\theta_\theta = G^\phi_\phi &= - \frac{\kappa}{a^2} - \frac{\dot a^2}{a^2} - \frac{2 \ddot a}{a} = 8\pi G p , \tag{10.26}
\end{align}
where overdots represent differentiation with respect to cosmic time $t$, so that for example $\dot a \equiv da/dt$. Note the trick of one index up, one down, to remove, modulo signs, the distorting effect of the metric on the Einstein tensor. The Einstein equations (10.26) rearrange to give Friedmann’s equations
\begin{align}
\frac{\dot a^2}{a^2} &= \frac{8\pi G \rho}{3} - \frac{\kappa}{a^2} , \notag \\
\frac{\ddot a}{a} &= - \frac{4\pi G}{3} (\rho + 3p) . \tag{10.27}
\end{align}
Friedmann’s two equations (10.27) are fundamental to cosmology. The first one relates the curvature $\kappa$ of the Universe to the expansion rate $\dot a/a$ and the density $\rho$. The second one relates the acceleration $\ddot a/a$ to the density $\rho$ plus 3 times the pressure $p$.

10.9 Newtonian “derivation” of Friedmann equations

10.9.1 Energy equation

Model a piece of the Universe as a ball of radius $a$ and mass $M = \frac{4}{3} \pi \rho a^3$. Consider a small mass $m$ attracted by this ball. Conservation of the kinetic plus potential energy of the small mass $m$ implies
\[
\frac{1}{2} m \dot a^2 - \frac{GMm}{a} = - \frac{\kappa m c^2}{2} , \tag{10.28}
\]
where the quantity on the right is some constant whose value is not determined by this Newtonian treatment, but which GR implies is as given. The energy equation (10.28) rearranges to
\[
\frac{\dot a^2}{a^2} = \frac{8\pi G \rho}{3} - \frac{\kappa c^2}{a^2} , \tag{10.29}
\]
which reproduces the first Friedmann equation.

10.9.2 First law of thermodynamics

For adiabatic expansion, the first law of thermodynamics is
\[
dE + p\, dV = 0 . \tag{10.30}
\]
With $E = \rho V$ and $V = \frac{4}{3} \pi a^3$, the first law (10.30) becomes
\[
d(\rho a^3) + p\, da^3 = 0 , \tag{10.31}
\]
or, with the derivative taken with respect to cosmic time $t$,
\[
\dot\rho + 3(\rho + p) \frac{\dot a}{a} = 0 . \tag{10.32}
\]
Differentiating the first Friedmann equation in the form
\[
\dot a^2 = \frac{8\pi G \rho a^2}{3} - \kappa c^2 \tag{10.33}
\]
gives
\[
2 \dot a \ddot a = \frac{8\pi G}{3} \left( \dot\rho a^2 + 2 \rho a \dot a \right) , \tag{10.34}
\]
and substituting $\dot\rho$ from the first law (10.32) reduces this to
\[
2 \dot a \ddot a = \frac{8\pi G}{3}\, a \dot a\, (- \rho - 3p) . \tag{10.35}
\]
Hence
\[
\frac{\ddot a}{a} = - \frac{4\pi G}{3} (\rho + 3p) , \tag{10.36}
\]
which reproduces the second Friedmann equation.

10.9.3 Comment on the Newtonian derivation

The above Newtonian derivation of Friedmann’s equations is only heuristic. A different result could have been obtained if different assumptions had been made. If for example the Newtonian gravitational force law $m \ddot a = - GMm/a^2$ were taken as correct, then it would follow that $\ddot a/a = - \frac{4}{3} \pi G \rho$, which is missing the all-important $3p$ contribution (without which there would be no inflation or dark energy) to Friedmann’s second equation.

It is notable that the first law of thermodynamics is built into the Friedmann equations. This implies that entropy is conserved in FRW Universes. This remains true even when the mix of particles changes, as happens for example during the epoch of electron-positron annihilation, or during big bang nucleosynthesis. How then does entropy increase in the real Universe? Through fluctuations away from the perfect homogeneity and isotropy assumed by the FRW metric.

10.10 Hubble parameter

The Hubble parameter $H(t)$ is defined by
\[
H \equiv \frac{\dot a}{a} . \tag{10.37}
\]
The Hubble parameter $H$ varies in cosmic time $t$, but is constant in space at fixed cosmic time $t$. The value of the Hubble parameter today is called the Hubble constant $H_0$ (the subscript 0 signifies the present time). The Hubble constant is measured from Cepheids and Type Ia supernovae to be (Riess et al. 2005, astro-ph/0503159)
\[
H_0 = 73 \pm 4\,(\text{stat}) \pm 5\,(\text{sys})\ \mathrm{km\,s^{-1}\,Mpc^{-1}} . \tag{10.38}
\]
The distance $d$ to an object that is receding with the expansion of the universe is proportional to the cosmic scale factor, $d \propto a$, and its recession velocity $v$ is consequently proportional to $\dot a$. The result is Hubble’s law relating the recession velocity $v$ and distance $d$ of distant objects
\[
v = H_0 d . \tag{10.39}
\]
Since it takes light time to travel from a distant object, and the Hubble parameter varies in time, the linear relation (10.39) breaks down at cosmological distances.

We, in the Milky Way, reside in an overdense region of the Universe that has collapsed out of the general Hubble expansion of the Universe. The local overdense region of the Universe that has just turned around from the general expansion and is beginning to collapse for the first time is called the Local Group of galaxies. The Local Group consists of about 40 galaxies, mostly dwarf and irregular galaxies. It contains two major spiral galaxies, Andromeda (M31) and the Milky Way, and one mid-sized spiral galaxy, Triangulum (M33). The Local Group is about 1 Mpc in radius.

Because of the ubiquity of the Hubble constant in cosmological studies, cosmologists often parameterize it by the quantity $h$ defined by
\[
h \equiv \frac{H_0}{100\ \mathrm{km\,s^{-1}\,Mpc^{-1}}} . \tag{10.40}
\]

10.11 Critical density

The critical density $\rho_{\rm crit}$ is defined to be the density required for the Universe to be flat, $\kappa = 0$. According to the first of Friedmann’s equations (10.27), this sets
\[
\rho_{\rm crit} \equiv \frac{3H^2}{8\pi G} . \tag{10.41}
\]
The critical density $\rho_{\rm crit}$, like the Hubble parameter $H$, evolves with time.
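To get a feel for the size of the critical density today, equation (10.41) can be evaluated with the Hubble constant (10.38). The Python sketch below is illustrative only: the SI values adopted for Newton's constant and for the megaparsec are standard rounded figures supplied here, not values taken from the text, and the function name is hypothetical.

```python
import math

G = 6.674e-11            # Newton's constant, m^3 kg^-1 s^-2 (rounded)
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec (rounded)


def critical_density(H_km_s_Mpc):
    """Critical density in kg m^-3 for a Hubble parameter given in km/s/Mpc."""
    H = H_km_s_Mpc / KM_PER_MPC          # Hubble parameter in s^-1
    return 3.0 * H * H / (8.0 * math.pi * G)


if __name__ == "__main__":
    print(critical_density(73.0))        # of order 1e-26 kg m^-3
```

For $H_0 = 73$ km s$^{-1}$ Mpc$^{-1}$ the result is close to $10^{-26}$ kg m$^{-3}$, equivalent to a few hydrogen atoms per cubic metre.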

10.12 Omega

Cosmologists designate the ratio of the actual density $\rho$ of the Universe to the critical density $\rho_{\rm crit}$ by the fateful letter $\Omega$, the final letter of the Greek alphabet,
\[
\Omega \equiv \frac{\rho}{\rho_{\rm crit}} . \tag{10.42}
\]
With no subscript, $\Omega$ denotes the total mass-energy density in all forms. A subscript $x$ on $\Omega_x$ denotes mass-energy density of type $x$. The curvature density $\rho_k$, which is not really a form of mass-energy but it is sometimes convenient to treat it as though it were, is defined by
\[
\rho_k \equiv - \frac{3 \kappa c^2}{8\pi G a^2} \tag{10.43}
\]
and correspondingly $\Omega_k \equiv \rho_k / \rho_{\rm crit}$. According to the first of Friedmann’s equations (10.27), the curvature density $\Omega_k$ satisfies
\[
\Omega_k = 1 - \Omega . \tag{10.44}
\]

Table 10.1 Cosmic inventory

Species                                 Symbol   Value (2008)
Dark energy (Λ)                         ΩΛ       0.72 ± 0.02
Non-baryonic cold dark matter (CDM)     Ωc       0.234 ± 0.02
Baryonic matter                         Ωb       0.046 ± 0.002
Neutrinos                               Ων       < 0.014
Photons (CMB)                           Ωγ       5 × 10^−5
Total                                   Ω        1.005 ± 0.006
Curvature                               Ωk       −0.005 ± 0.006

Table 10.1 gives 2008 measurements of $\Omega$ in various species, obtained by combining 5-year WMAP CMB measurements with a variety of other astronomical evidence, including supernovae, big bang nucleosynthesis, galaxy clustering, weak lensing, and local measurements of the Hubble constant $H_0$.

10.13 Redshifting

The spatial translation symmetry of the FRW metric implies conservation of generalized momentum. As you will show in a problem set, a particle that moves along a geodesic in the radial direction, so that $d\theta = d\phi = 0$, has 4-velocity $u^\nu$ satisfying
\[
u_{x_\parallel} = \text{constant} . \tag{10.45}
\]
This conservation law implies that the proper momentum $p_\parallel$ of a radially moving particle decays as
\[
p_\parallel \equiv m a \frac{dx_\parallel}{d\tau} \propto \frac{1}{a} , \tag{10.46}
\]
which is true for both massive and massless particles. It follows from equation (10.46) that light observed on Earth from a distant object will be redshifted by a factor
\[
1 + z = \frac{a_0}{a} , \tag{10.47}
\]
where $a_0$ is the present day cosmic scale factor. Cosmologists often refer to the redshift of an epoch, since the cosmological redshift is an observationally accessible quantity that uniquely determines the cosmic time of emission.

10.14 Types of mass-energy

The energy-momentum tensor $T_{\mu\nu}$ of an FRW Universe is necessarily homogeneous and isotropic, by assumption of the cosmological principle, taking the form (note yet again the trick of one index up and one down to remove the distorting effect of the metric)
\[
T^\mu_\nu =
\begin{pmatrix}
T^t_t & 0 & 0 & 0 \\
0 & T^r_r & 0 & 0 \\
0 & 0 & T^\theta_\theta & 0 \\
0 & 0 & 0 & T^\phi_\phi
\end{pmatrix}
=
\begin{pmatrix}
-\rho & 0 & 0 & 0 \\
0 & p & 0 & 0 \\
0 & 0 & p & 0 \\
0 & 0 & 0 & p
\end{pmatrix} . \tag{10.48}
\]
Table 10.2 gives equations of state $p/\rho$ for generic species of mass-energy, along with $(\rho + 3p)/\rho$, which determines the gravitational attraction per unit energy, and how the mass-energy varies with cosmic scale factor, $\rho \propto a^n$.

Table 10.2 Properties of universes dominated by various species

Species      p/ρ        (ρ + 3p)/ρ    ρ ∝
Radiation    1/3        2             a^−4
Matter       0          1             a^−3
Curvature    “−1/3”     “0”           a^−2
Vacuum       −1         −2            a^0

As commented in §10.9.3 above, the first law of thermodynamics for adiabatic expansion is built into Friedmann’s equations. In fact the law represents covariant conservation of energy-momentum for the system as a whole
\[
D_\mu T^{\mu\nu} = 0 . \tag{10.49}
\]

As long as species do not convert into each other (for example, no annihilation), covariant energy-momentum conservation holds individually for each species, so the first law applies to each species individually, determining how its energy density ρ varies with cosmic scale factor a. Figure 10.2 illustrates how the energy densities ρ of various species evolve as a function of scale factor a.
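The scalings $\rho \propto a^n$ of Table 10.2 can be recovered from the first law (10.31) applied to a single species. As a sketch, assume that the species has a constant equation of state $w \equiv p/\rho$ (the constancy of $w$ is an assumption made here for illustration). Then

```latex
\begin{align*}
d(\rho a^3) + w \rho \, d(a^3) &= 0 \\
a^3 \, d\rho + (1 + w) \rho \, d(a^3) &= 0 \\
\frac{d\rho}{\rho} &= - 3 (1 + w) \frac{da}{a} \\
\rho &\propto a^{-3(1+w)} .
\end{align*}
```

Setting $w = 1/3$, $0$, $-1/3$, and $-1$ gives $\rho \propto a^{-4}$, $a^{-3}$, $a^{-2}$, and $a^{0}$, in agreement with the radiation, matter, curvature, and vacuum rows of Table 10.2.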

10.15 Evolution of the cosmic scale factor

Given how the energy density $\rho$ of each species evolves with cosmic scale factor $a$, the first Friedmann equation then determines how the cosmic scale factor $a(t)$ itself evolves with cosmic time $t$. The evolution equation for $a(t)$ can be cast as an equation for the Hubble parameter $H \equiv \dot{a}/a$, which in view of the definition (10.41) of the critical density can be written

$$ \frac{H(t)}{H_0} = \left( \frac{\rho_{\rm crit}(t)}{\rho_{\rm crit}(t_0)} \right)^{1/2} . \tag{10.50} $$

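[Figure 10.2: mass-energy density $\rho$ (vertical axis) against cosmic scale factor $a$ (horizontal axis), with curves $\rho_\gamma \propto a^{-4}$, $\rho_m \propto a^{-3}$, $\rho_k \propto a^{-2}$, and $\rho_\Lambda = {\rm constant}$.]

Figure 10.2 Behavior of the mass-energy density ρ of various species as a function of cosmic time t.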

Given the definition (10.43) of the curvature density as the critical density minus the total density, the critical density $\rho_{\rm crit}$ is itself the sum of the densities $\rho$ of all species including the curvature density

$$ \rho_{\rm crit} = \rho_k + \sum_{{\rm species}\ x} \rho_x . \tag{10.51} $$

Integrating equation (10.50) gives cosmic time $t$ as a function of cosmic scale factor $a$

$$ t = \int \frac{da}{a H} . \tag{10.52} $$

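For example, in the case that the density is comprised of radiation, matter, and vacuum, the critical density is

$$ \rho_{\rm crit} = \rho_\gamma + \rho_m + \rho_k + \rho_\Lambda , \tag{10.53} $$

and equation (10.50) is

$$ \frac{H(t)}{H_0} = \left( \Omega_\gamma a^{-4} + \Omega_m a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda \right)^{1/2} , \tag{10.54} $$

where $\Omega_x$ represents its value at the present time. The time $t$, equation (10.52), is then

$$ t = \frac{1}{H_0} \int \frac{da}{a \left( \Omega_\gamma a^{-4} + \Omega_m a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda \right)^{1/2}} , \tag{10.55} $$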


which is an elliptic integral of the 3rd kind. If one single species in particular dominates the mass-energy density, then equation (10.55) integrates easily to give the results in the following table.

Table 10.3 Evolution of cosmic scale factor in universes dominated by various species

Dominant species    a ∝
Radiation           t^{1/2}
Matter              t^{2/3}
Curvature           t
Vacuum              e^{Ht}
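When several species contribute, the integral (10.55) is straightforward to evaluate numerically. The following is a minimal sketch using a plain midpoint rule, with $a$ normalized to unity today. The function names and the step count are illustrative choices, not part of the text.

```python
def hubble_ratio(a, omega_r, omega_m, omega_k, omega_l):
    """H/H0 at scale factor a, equation (10.54), with a = 1 today."""
    return (omega_r * a**-4 + omega_m * a**-3 + omega_k * a**-2 + omega_l) ** 0.5


def cosmic_time(a_end, omega_r, omega_m, omega_k, omega_l, steps=100_000):
    """H0 * t elapsed between a = 0 and a = a_end, equation (10.55).

    Uses the midpoint rule, which never evaluates the integrand at a = 0.
    """
    da = a_end / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da
        total += da / (a * hubble_ratio(a, omega_r, omega_m, omega_k, omega_l))
    return total


# Matter-dominated flat universe: H0 * t0 should be 2/3, and a ∝ t^(2/3).
print(cosmic_time(1.0, 0.0, 1.0, 0.0, 0.0))
# Radiation-dominated flat universe: H0 * t0 should be 1/2, and a ∝ t^(1/2).
print(cosmic_time(1.0, 1.0, 0.0, 0.0, 0.0))
```

The two single-species cases reproduce the power laws of Table 10.3. A pure vacuum universe is excluded from this check because its integral from $a = 0$ diverges logarithmically, consistent with $a \propto e^{Ht}$.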

10.16 Conformal time

Especially when doing cosmological perturbation theory, it is convenient to use conformal time $\eta$ defined by (with units $c$ temporarily restored)

$$ a \, d\eta \equiv c \, dt \tag{10.56} $$

with respect to which the FRW metric is

$$ ds^2 = a(\eta)^2 \left( - d\eta^2 + \frac{dx^2}{1 - \kappa x^2} + x^2 do^2 \right) . \tag{10.57} $$

The term conformal refers to a metric that is multiplied by an overall factor, the conformal factor. In the FRW metric (10.57), the cosmic scale factor $a$ is the conformal factor. Conformal time $\eta$ has the property that the speed of light is one in conformal coordinates: light moves unit comoving distance per unit conformal time. In particular, light moving radially towards an observer at $x_\parallel = 0$, with $d\theta = d\phi = 0$, satisfies

$$ \frac{dx_\parallel}{d\eta} = -1 . \tag{10.58} $$

10.17 Looking back along the lightcone

Since light moves at unit velocity in conformal coordinates, an object at geodesic distance $x_\parallel$ that emits light at conformal time $\eta_{\rm emit}$ is observed at conformal time $\eta_{\rm obs}$ given by

$$ x_\parallel = \eta_{\rm obs} - \eta_{\rm emit} . \tag{10.59} $$

The comoving geodesic distance $x_\parallel$ to an object is

$$ x_\parallel = \int_{\eta_{\rm emit}}^{\eta_{\rm obs}} d\eta = \int_{t_{\rm emit}}^{t_{\rm obs}} \frac{c \, dt}{a} = \int_{a_{\rm emit}}^{a_{\rm obs}} \frac{c \, da}{a^2 H} = \int_0^z \frac{c \, dz}{H} , \tag{10.60} $$

where the last equation assumes the relation $1 + z = 1/a$, valid as long as $a$ is normalized to unity at the observer (us) at the present time, $a_{\rm obs} = a_0 = 1$. In the case that the density is comprised of (curvature and) radiation, matter, and vacuum, equation (10.60) gives

$$ x_\parallel = \frac{c}{H_0} \int_{1/(1+z)}^{1} \frac{da}{a^2 \left( \Omega_\gamma a^{-4} + \Omega_m a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda \right)^{1/2}} , \tag{10.61} $$

which is an elliptic integral of the 1st kind. Given the geodesic comoving distance $x_\parallel$, the circumferential comoving distance $x$ then follows as

$$ x = \frac{\sinh\left( \Omega_k^{1/2} H_0 x_\parallel / c \right)}{\Omega_k^{1/2} H_0 / c} . \tag{10.62} $$

To second order in redshift $z$,

$$ x \approx x_\parallel \approx \frac{c}{H_0} \left[ z - z^2 \left( \Omega_\gamma + \tfrac{3}{4} \Omega_m + \tfrac{1}{2} \Omega_k \right) + ... \right] . \tag{10.63} $$

The geodesic and circumferential distances $x_\parallel$ and $x$ differ at order $z^3$.
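As a numerical check on equations (10.60) and (10.63), the sketch below integrates $c\,dz/H$ with a midpoint rule and compares it with the second-order expansion. Distances are in units of $c/H_0$, and the density parameters are assumed to sum to one (with $\Omega_k$ included) so that $H = H_0$ at $z = 0$. Function names are illustrative.

```python
def comoving_distance(z, omega_r, omega_m, omega_k, omega_l, steps=100_000):
    """Geodesic comoving distance x_parallel in units of c/H0, equation (10.60)."""
    dz = z / steps
    total = 0.0
    for i in range(steps):
        a = 1.0 / (1.0 + (i + 0.5) * dz)  # 1 + z = 1/a with a0 = 1
        hubble = (omega_r * a**-4 + omega_m * a**-3 + omega_k * a**-2 + omega_l) ** 0.5
        total += dz / hubble
    return total


def second_order_distance(z, omega_r, omega_m, omega_k):
    """Expansion (10.63) to second order in z, in units of c/H0."""
    return z - z**2 * (omega_r + 0.75 * omega_m + 0.5 * omega_k)


# Flat matter-dominated case, where the exact answer is 2 (1 - (1 + z)^(-1/2)).
print(comoving_distance(3.0, 0.0, 1.0, 0.0, 0.0))
print(comoving_distance(0.01, 0.0, 1.0, 0.0, 0.0), second_order_distance(0.01, 0.0, 1.0, 0.0))
```

At $z = 0.01$ the two expressions agree to within a term of order $z^3$, as expected from a second-order expansion.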

10.18 Horizon

Light can come from no more distant point than the Big Bang. This distant point defines the horizon of our Universe, which is located at infinite redshift, $z = \infty$. Equation (10.60) gives the geodesic distance from us at redshift zero to the horizon as

$$ x_\parallel({\rm horizon}) = \int_0^\infty \frac{c \, dz}{H} \tag{10.64} $$

where again the cosmic scale factor has been normalized to unity at the present time, $a_0 = 1$. Equation (10.64) formally defines the event horizon of the Universe, but the cosmological scale over which objects can continue to affect each other causally is typically smaller than this (much smaller, post-inflation). It is thus common to define the cosmological horizon distance at any time as

$$ \mbox{cosmological horizon distance} \equiv \frac{c}{H} \tag{10.65} $$

which is roughly the scale over which objects can remain in causal contact.

Exercise 10.2 Then versus now.

[Figure 10.3: size of the Universe (meters) against age of the Universe (shown in both seconds and years).]

Figure 10.3 Cosmic scale factor a and cosmological horizon distance c/H as a function of cosmic time t.

1. Prove that

$$ \int_0^\infty \frac{x^{n-1} \, dx}{e^x + 1} = \left( 1 - 2^{1-n} \right) \int_0^\infty \frac{x^{n-1} \, dx}{e^x - 1} . \tag{10.66} $$

[Hint: Use the fact that $(e^x + 1)(e^x - 1) = (e^{2x} - 1)$.] Hence argue that the ratios of energy, entropy, and number densities of relativistic fermionic ($f$) to relativistic bosonic ($b$) species in thermodynamic equilibrium at the same temperature are

$$ \frac{\rho_f}{\rho_b} = \frac{s_f}{s_b} = \frac{7}{8} , \quad \frac{n_f}{n_b} = \frac{3}{4} . \tag{10.67} $$

[Hint: The proper entropy density of each relativistic species is $s = (\rho + p)/T = (4/3)\rho/T$.]

2. Weak interactions were fast enough to keep neutrinos in thermodynamic equilibrium with photons, electrons, and positrons up to just before $e\bar{e}$ annihilation, but then neutrinos decoupled. Argue that conservation of comoving entropy implies

$$ a^3 T^3 \left( g_\gamma + \tfrac{7}{8} g_e \right) = T_\gamma^3 g_\gamma , \tag{10.68a} $$

$$ a^3 T^3 g_\nu = T_\nu^3 g_\nu , \tag{10.68b} $$

where the left hand sides refer to quantities before $e\bar{e}$ annihilation, which happened at $T \sim 1\,{\rm MeV} \approx 10^{10}\,{\rm K}$, and the right hand sides to quantities after $e\bar{e}$ annihilation (including today). Deduce the ratio of neutrino to photon temperatures today,

$$ \frac{T_\nu}{T_\gamma} . \tag{10.69} $$

Does the temperature ratio (10.69) depend on the number of neutrino types? What is the neutrino temperature today in K, if the photon temperature today is 2.725 K?

3. The energy, entropy, and number densities of relativistic particles today are, with units restored (energy density $\rho$ in units energy/volume; entropy and number density $s$ and $n$ in units 1/volume),

$$ \rho = g_{\rho,0} \frac{\pi^2 (kT_0)^4}{30 c^3 \hbar^3} , \quad s = g_{s,0} \frac{2\pi^2 (kT_0)^3}{45 c^3 \hbar^3} , \quad n = g_{n,0} \frac{\zeta(3) (kT_0)^3}{\pi^2 c^3 \hbar^3} , \tag{10.70} $$

where $T_0 = 2.725$ K is the CMB temperature today, $\zeta(3) = 1.2020569$ is a Riemann zeta function, and $g_{\rho,0}$, $g_{s,0}$, and $g_{n,0}$ denote the energy-, entropy-, and number-weighted effective number of relativistic species today, normalized to 1 per bosonic degree of freedom. What are the arithmetic values of $g_{\rho,0}$, $g_{s,0}$, and $g_{n,0}$ if the relativistic species consist of photons and three species of neutrino? What is the energy density $\Omega_r$ of relativistic particles today relative to the critical density? [Hint: Don’t forget to take into account the fact that the neutrino temperature today differs from the photon temperature.]

4. Evidence for neutrino oscillations from the MINOS experiment (2008, http://www-numi.fnal.gov/PublicInfo/forscientists.html) indicates that at least one neutrino type has mass $m_\nu \gtrsim 0.05$ eV. At what redshift $z_\nu$ would such a neutrino become non-relativistic? If neutrinos are non-relativistic, what is the neutrino density $\Omega_\nu$ relative to the critical density, in terms of the sum $\sum m_\nu$ of the neutrino masses? Which of the effective number of relativistic species today $g_{\rho,0}$, $g_{s,0}$, and $g_{n,0}$ is changed if some neutrinos are non-relativistic today?

5. Use entropy conservation to argue that the ratio of the photon temperature $T$ at redshift $z$ in the early Universe to the photon temperature $T_0$ today is

$$ \frac{T}{T_0} = (1 + z) \left( \frac{g_{s,0}}{g_s} \right)^{1/3} . \tag{10.71} $$

What is $g_s$ in terms of the numbers $g_b$ and $g_f$ of relativistic boson and fermion types, if all species were at the same temperature $T$?

[Figure 10.4: two panels plotted against the age of the Universe (shown in both seconds and years). Top panel: radiation temperature of the Universe (Kelvin). Bottom panel: mass-energy density of the Universe (kg/m^3).]

Figure 10.4 (Top) Temperature T, and (bottom) mass-energy density ρ, of the Universe as a function of cosmic time t.

Solution. The ratio of neutrino to photon temperatures post $e\bar{e}$ annihilation is

$$ \frac{T_\nu}{T_\gamma} = \left( \frac{g_\gamma}{g_\gamma + \tfrac{7}{8} g_e} \right)^{1/3} = \left( \frac{4}{11} \right)^{1/3} . \tag{10.72} $$

The Cosmic Neutrino Background temperature is

$$ T_\nu = \left( \frac{4}{11} \right)^{1/3} 2.725 \,{\rm K} = 1.945 \,{\rm K} . \tag{10.73} $$

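With 2 bosonic degrees of freedom from photons, and 6 fermionic degrees of freedom from 3 relativistic neutrino types, the effective energy-, entropy-, and number-weighted number of relativistic degrees of freedom is

$$ g_{\rho,0} = g_\gamma + \frac{7}{8} \left( \frac{T_\nu}{T_\gamma} \right)^4 g_\nu = 2 + \frac{7}{8} \left( \frac{4}{11} \right)^{4/3} 6 = 3.36 , \tag{10.74a} $$

$$ g_{s,0} = g_\gamma + \frac{7}{8} \left( \frac{T_\nu}{T_\gamma} \right)^3 g_\nu = 2 + \frac{7}{8} \, \frac{4}{11} \, 6 = \frac{43}{11} = 3.91 , \tag{10.74b} $$

$$ g_{n,0} = g_\gamma + \frac{3}{4} \left( \frac{T_\nu}{T_\gamma} \right)^3 g_\nu = 2 + \frac{3}{4} \, \frac{4}{11} \, 6 = \frac{40}{11} = 3.64 . \tag{10.74c} $$

The redshift at which a neutrino of mass $m_\nu$ becomes non-relativistic is

$$ 1 + z_\nu = \frac{m_\nu}{T_\nu} = 300 \left( \frac{m_\nu}{0.05 \,{\rm eV}} \right) . \tag{10.75} $$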

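If some neutrinos are non-relativistic, then the neutrino density $\Omega_\nu$ today is related to the sum $\sum m_\nu$ of neutrino masses by

$$ \Omega_\nu = \frac{8\pi G \sum m_\nu n_\nu}{3 H_0^2} = 5.3 \times 10^{-4} \left( \frac{\sum m_\nu}{0.05 \,{\rm eV}} \right) \left( \frac{h}{0.71} \right)^{-2} . \tag{10.76} $$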

$g_{\rho,0}$ is changed if some neutrinos are non-relativistic today, but $g_{s,0}$ and $g_{n,0}$ remain unchanged. If just one of the neutrino types is massive, and the other two are relativistic, then

$$ g_{\rho,0} = 2 + \frac{7}{8} \left( \frac{4}{11} \right)^{4/3} 4 = 2.91 . \tag{10.77} $$

The radiation density $\Omega_r$ today, including photons and neutrinos, is

$$ \Omega_r = \frac{8\pi G \rho_r}{3 c^2 H_0^2} = 1.236 \times 10^{-5} \, g_{\rho,0} \, h^{-2} = 8.2 \times 10^{-5} \left( \frac{g_{\rho,0}}{3.36} \right) \left( \frac{h}{0.71} \right)^{-2} . \tag{10.78} $$

If the temperatures of all species are equal, then the entropy-weighted effective number of relativistic species is

$$ g_s = g_b + \frac{7}{8} g_f . \tag{10.79} $$
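The arithmetic in the solution above can be reproduced directly. The sketch below evaluates equations (10.73), (10.74), and (10.78). The physical constants are standard SI values entered by hand (an assumption of this sketch, not quoted in the text), and the names `omega_radiation` and `H100` are illustrative.

```python
import math

T_GAMMA = 2.725            # CMB temperature today in K, as given in the exercise
K_B = 1.380649e-23         # Boltzmann constant, J/K
HBAR = 1.054571817e-34     # reduced Planck constant, J s
C = 2.99792458e8           # speed of light, m/s
G = 6.6743e-11             # Newton's constant, m^3 kg^-1 s^-2
H100 = 1.0e5 / 3.0856775814913673e22  # 100 km/s/Mpc expressed in 1/s

ratio = (4.0 / 11.0) ** (1.0 / 3.0)   # T_nu / T_gamma, equation (10.72)
t_nu = ratio * T_GAMMA                # equation (10.73)

g_rho = 2 + (7 / 8) * 6 * ratio**4    # equation (10.74a)
g_s = 2 + (7 / 8) * 6 * ratio**3      # equation (10.74b)
g_n = 2 + (3 / 4) * 6 * ratio**3      # equation (10.74c)


def omega_radiation(g, h):
    """Omega_r for g effective species and H0 = 100 h km/s/Mpc, equation (10.78)."""
    energy_density = g * math.pi**2 * (K_B * T_GAMMA) ** 4 / (30 * C**3 * HBAR**3)
    critical_energy_density = 3 * C**2 * (h * H100) ** 2 / (8 * math.pi * G)
    return energy_density / critical_energy_density


print(f"T_nu = {t_nu:.3f} K")
print(f"g_rho = {g_rho:.2f}, g_s = {g_s:.2f}, g_n = {g_n:.2f}")
print(f"Omega_r h^2 per unit g = {omega_radiation(1.0, 1.0):.3e}")
print(f"Omega_r (h = 0.71) = {omega_radiation(g_rho, 0.71):.2e}")
```

Computing $\Omega_r$ from the constants rather than hard-coding the prefactor $1.236 \times 10^{-5}$ makes the check independent of the quoted coefficient.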

PART FIVE TETRAD APPROACH TO GENERAL RELATIVITY

Concept Questions

1. The vierbein has 16 degrees of freedom instead of the 10 degrees of freedom of the metric. What do the extra 6 degrees of freedom correspond to?
2. Tetrad transformations are defined to be Lorentz transformations. Don’t general coordinate transformations already include Lorentz transformations as a particular case, so aren’t tetrad transformations redundant?
3. What does coordinate gauge-invariant mean? What does tetrad gauge-invariant mean?
4. Is the coordinate metric $g_{\mu\nu}$ tetrad gauge-invariant?
5. What does a directed derivative $\partial_m$ mean physically?
6. Is the directed derivative $\partial_m$ coordinate gauge-invariant?
7. Is the tetrad metric $\gamma_{mn}$ coordinate gauge-invariant? Is it tetrad gauge-invariant?
8. What is the tetrad-frame 4-velocity $u^m$ of a person at rest in an orthonormal tetrad frame?
9. If the tetrad frame is accelerating (not in free-fall), which of the following is true/false?
   a. Does the tetrad-frame 4-velocity $u^m$ of a person continuously at rest in the tetrad frame change with time? $\partial_0 u^m = 0$? $D_0 u^m = 0$?
   b. Do the tetrad axes $\boldsymbol{\gamma}_m$ change with time? $\partial_0 \boldsymbol{\gamma}_m = 0$? $D_0 \boldsymbol{\gamma}_m = 0$?
   c. Does the tetrad metric $\gamma_{mn}$ change with time? $\partial_0 \gamma_{mn} = 0$? $D_0 \gamma_{mn} = 0$?
   d. Do the covariant components $u_m$ of the 4-velocity of a person continuously at rest in the tetrad frame change with time? $\partial_0 u_m = 0$? $D_0 u_m = 0$?
10. Suppose that $\boldsymbol{p} = \boldsymbol{\gamma}_m p^m$ is a 4-vector. Is the proper rate of change of the proper components $p^m$ measured by an observer equal to the directed time derivative $\partial_0 p^m$ or to the covariant time derivative $D_0 p^m$? What about the covariant components $p_m$ of the 4-vector? [Hint: The proper contravariant components of the 4-vector measured by an observer are $p^m \equiv \boldsymbol{\gamma}^m \cdot \boldsymbol{p}$ where $\boldsymbol{\gamma}^m$ are the contravariant locally inertial rest axes of the observer. Similarly the proper covariant components are $p_m \equiv \boldsymbol{\gamma}_m \cdot \boldsymbol{p}$.]
11. A person with two eyes separated by proper distance $\delta\xi^n$ observes an object. The observer observes the photon 4-vector from the object to be $p^m$. The observer uses the difference $\delta p^m$ in the two 4-vectors detected by the two eyes to infer the binocular distance to the object. Is the difference $\delta p^m$ in photon 4-vectors detected by the two eyes equal to the directed derivative $\delta\xi^n \partial_n p^m$ or to the covariant derivative $\delta\xi^n D_n p^m$?
12. Suppose that $p^m$ is a tetrad 4-vector. Parallel-transport the 4-vector by an infinitesimal proper distance $\delta\xi^n$. Is the change in $p^m$ measured by an ensemble of observers at rest in the tetrad frame equal to the directed derivative $\delta\xi^n \partial_n p^m$ or to the covariant derivative $\delta\xi^n D_n p^m$? [Hint: What if “rest” means that the observer at each point is separately at rest in the tetrad frame at that point? What if “rest” means that the observers are mutually at rest relative to each other in the rest frame of the tetrad at one particular point?]
13. What is the physical significance of the fact that directed derivatives fail to commute?
14. Physically, what do the tetrad connection coefficients $\Gamma_{kmn}$ mean?
15. What is the physical significance of the fact that $\Gamma_{kmn}$ is antisymmetric in its first two indices (if the tetrad metric $\gamma_{mn}$ is constant)?
16. Are the tetrad connections $\Gamma_{kmn}$ coordinate gauge-invariant?

What’s important?

This part of the notes describes the tetrad formalism of GR.

1. Why tetrads? Because physics is clearer in a locally inertial frame than in a coordinate frame.
2. The primitive object in the tetrad formalism is the vierbein $e_m{}^\mu$, in place of the metric in the coordinate formalism.
3. Written suitably, for example as equation (11.9), a metric $ds^2$ encodes not only the metric coefficients $g_{\mu\nu}$, but a full (inverse) vierbein $e^m{}_\mu$, through $ds^2 = \gamma_{mn} \, e^m{}_\mu dx^\mu \, e^n{}_\nu dx^\nu$.
4. The tetrad road from vierbein to energy-momentum is similar to the coordinate road from metric to energy-momentum, albeit a little more complicated.
5. In the tetrad formalism, the directed derivative $\partial_m$ is the analog of the coordinate partial derivative $\partial/\partial x^\mu$ of the coordinate formalism. Directed derivatives $\partial_m$ do not commute, whereas coordinate derivatives $\partial/\partial x^\mu$ do commute.

11 The tetrad formalism

11.1 Tetrad

A tetrad (greek foursome) $\boldsymbol{\gamma}_m(x)$ is a set of axes

$$ \boldsymbol{\gamma}_m \equiv \{ \boldsymbol{\gamma}_0 , \boldsymbol{\gamma}_1 , \boldsymbol{\gamma}_2 , \boldsymbol{\gamma}_3 \} \tag{11.1} $$

attached to each point $x^\mu$ of spacetime. The common case is that of an orthonormal tetrad, where the axes form a locally inertial frame at each point, so that the scalar products of the axes constitute the Minkowski metric $\eta_{mn}$

$$ \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n = \eta_{mn} . \tag{11.2} $$

However, other tetrads prove useful in appropriate circumstances. There are spinor tetrads, null tetrads (notably the Newman-Penrose double null tetrad), and others (indeed, the basis of coordinate tangent vectors $\boldsymbol{g}_\mu$ is itself a tetrad). In general, the tetrad metric is some symmetric matrix $\gamma_{mn}$

$$ \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n \equiv \gamma_{mn} . \tag{11.3} $$

Andrew’s convention: latin (black) dummy indices label tetrad frames; greek (brown) dummy indices label coordinate frames.

Why introduce tetrads?

1. The physics is more transparent when expressed in a locally inertial frame (or some other frame adapted to the physics), as opposed to the coordinate frame, where Salvador Dali rules.
2. If you want to consider spin-$\frac{1}{2}$ particles and quantum physics, you had better work with tetrads.
3. For good reason, much of the GR literature works with tetrads, so it’s useful to understand them.

11.2 Vierbein

The vierbein (German four-legs, or colloquially, critter) $e_m{}^\mu$ is defined to be the matrix that transforms between the tetrad frame and the coordinate frame (note the placement of indices: the tetrad index $m$ comes first, then the coordinate index $\mu$)

$$ \boldsymbol{\gamma}_m = e_m{}^\mu \, \boldsymbol{g}_\mu . \tag{11.4} $$

The vierbein is a 4 × 4 matrix, with 16 independent components. The inverse vierbein $e^m{}_\mu$ is defined to be the matrix inverse of the vierbein $e_m{}^\mu$, so that

$$ e_m{}^\mu \, e^m{}_\nu = \delta^\mu_\nu , \quad e^n{}_\mu \, e_m{}^\mu = \delta^n_m . \tag{11.5} $$

Thus equation (11.4) inverts to

$$ \boldsymbol{g}_\mu = e^m{}_\mu \, \boldsymbol{\gamma}_m . \tag{11.6} $$

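11.3 The metric encodes the vierbein

The scalar spacetime distance is

$$ ds^2 = g_{\mu\nu} \, dx^\mu dx^\nu = \boldsymbol{g}_\mu \cdot \boldsymbol{g}_\nu \, dx^\mu dx^\nu = \gamma_{mn} \, e^m{}_\mu e^n{}_\nu \, dx^\mu dx^\nu \tag{11.7} $$

from which it follows that the coordinate metric $g_{\mu\nu}$ is

$$ g_{\mu\nu} = \gamma_{mn} \, e^m{}_\mu e^n{}_\nu . \tag{11.8} $$

The shorthand way in which metrics are commonly written encodes not only a metric but also an inverse vierbein, hence a tetrad. For example, the Schwarzschild metric

$$ ds^2 = - \left( 1 - \frac{2M}{r} \right) dt^2 + \left( 1 - \frac{2M}{r} \right)^{-1} dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2 \tag{11.9} $$

takes the form (11.7) with an orthonormal (Minkowski) tetrad metric $\gamma_{mn} = \eta_{mn}$, and an inverse vierbein encoded in the differentials

$$ e^0{}_\mu \, dx^\mu = \left( 1 - \frac{2M}{r} \right)^{1/2} dt , \tag{11.10a} $$

$$ e^1{}_\mu \, dx^\mu = \left( 1 - \frac{2M}{r} \right)^{-1/2} dr , \tag{11.10b} $$

$$ e^2{}_\mu \, dx^\mu = r \, d\theta , \tag{11.10c} $$

$$ e^3{}_\mu \, dx^\mu = r \sin\theta \, d\phi . \tag{11.10d} $$

Explicitly, the inverse vierbein of the Schwarzschild metric is the diagonal matrix

$$ e^m{}_\mu = \begin{pmatrix} (1 - 2M/r)^{1/2} & 0 & 0 & 0 \\ 0 & (1 - 2M/r)^{-1/2} & 0 & 0 \\ 0 & 0 & r & 0 \\ 0 & 0 & 0 & r \sin\theta \end{pmatrix} , \tag{11.11} $$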

and the corresponding vierbein is (note that, because the tetrad index is always in the first place and the coordinate index is always in the second place, the matrices as written are actually inverse transposes of each other, not just inverses)

$$ e_m{}^\mu = \begin{pmatrix} (1 - 2M/r)^{-1/2} & 0 & 0 & 0 \\ 0 & (1 - 2M/r)^{1/2} & 0 & 0 \\ 0 & 0 & 1/r & 0 \\ 0 & 0 & 0 & 1/(r \sin\theta) \end{pmatrix} . \tag{11.12} $$
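A short numerical sketch can confirm that the Schwarzschild vierbein reproduces the metric through equation (11.8). Because both matrices are diagonal, only the diagonal entries are stored. The function names are illustrative, the default $M = 1$ is an arbitrary choice, and the code is restricted to $r > 2M$, where the square roots are real.

```python
import math

ETA_DIAGONAL = [-1.0, 1.0, 1.0, 1.0]  # Minkowski tetrad metric eta_mn


def inverse_vierbein(r, theta, mass=1.0):
    """Diagonal entries of e^m_mu, equation (11.11), valid for r > 2 M."""
    f = 1.0 - 2.0 * mass / r
    return [math.sqrt(f), 1.0 / math.sqrt(f), r, r * math.sin(theta)]


def vierbein(r, theta, mass=1.0):
    """Diagonal entries of e_m^mu, equation (11.12): reciprocals of (11.11)."""
    return [1.0 / entry for entry in inverse_vierbein(r, theta, mass)]


def metric(r, theta, mass=1.0):
    """Diagonal entries of g_mu_nu = eta_mn e^m_mu e^n_nu, equation (11.8)."""
    return [eta * e * e for eta, e in zip(ETA_DIAGONAL, inverse_vierbein(r, theta, mass))]


# At r = 4 M on the equator the metric should be diag(-1/2, 2, 16, 16).
print(metric(4.0, math.pi / 2))
```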

Concept question 11.1 Schwarzschild vierbein. The components $e_0{}^t$ and $e_1{}^r$ of the Schwarzschild vierbein (11.12) are imaginary inside the horizon. What does this mean? Is the vierbein still valid inside the horizon? ⋄

11.4 Tetrad transformations

Tetrad transformations are transformations that preserve the fundamental property of interest, for example the orthonormality, of the tetrad. For all tetrads of interest in these notes, which includes not only orthonormal tetrads, but also spinor tetrads and null tetrads (but not coordinate-based tetrads), tetrad transformations are Lorentz transformations. Hereafter these notes will presume that a tetrad transformation is a Lorentz transformation. The Lorentz transformation may be, and usually is, a different transformation at each point. Tetrad transformations rotate the tetrad axes $\boldsymbol{\gamma}_k$ at each point by a Lorentz transformation $L_k{}^m$, while keeping the background coordinates $x^\mu$ unchanged:

$$ \boldsymbol{\gamma}_k \to \boldsymbol{\gamma}'_k = L_k{}^m \, \boldsymbol{\gamma}_m . \tag{11.13} $$

In the case that the tetrad axes $\boldsymbol{\gamma}_k$ are orthonormal, with a Minkowski metric, the Lorentz transformation matrices $L_k{}^m$ in equation (11.13) take the familiar special relativistic form, but the linear matrices $L_k{}^m$ in equation (11.13) signify a Lorentz transformation in any case. In all the cases of interest, including orthonormal, spinor, and null tetrads, the tetrad metric $\gamma_{mn}$ is constant. Lorentz transformations are precisely those transformations that leave the tetrad metric unchanged

$$ \gamma'_{kl} = \boldsymbol{\gamma}'_k \cdot \boldsymbol{\gamma}'_l = L_k{}^m L_l{}^n \, \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n = L_k{}^m L_l{}^n \, \gamma_{mn} = \gamma_{kl} . \tag{11.14} $$
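Equation (11.14) is easy to verify numerically for an orthonormal tetrad. The sketch below builds a boost along the $x$ axis, parameterized by rapidity, and checks that it leaves the Minkowski metric unchanged. The sign convention of the boost matrix and the function names are assumptions of this sketch.

```python
import math

ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def boost_x(rapidity):
    """Lorentz boost L_k^m along the x axis (one common sign convention)."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [
        [ch, sh, 0.0, 0.0],
        [sh, ch, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transformed_metric(lorentz, tetrad_metric):
    """gamma'_{kl} = L_k^m L_l^n gamma_{mn}, equation (11.14)."""
    idx = range(4)
    return [
        [
            sum(lorentz[k][m] * lorentz[l][n] * tetrad_metric[m][n] for m in idx for n in idx)
            for l in idx
        ]
        for k in idx
    ]


for row in transformed_metric(boost_x(0.5), ETA):
    print(row)
```

The printed matrix agrees with $\eta_{mn}$ up to rounding, because $\cosh^2 - \sinh^2 = 1$.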

Exercise 11.2 Generators of Lorentz transformations are antisymmetric. From the condition that the tetrad metric $\gamma_{mn}$ is unchanged by a Lorentz transformation, show that the generator of an infinitesimal Lorentz transformation is an antisymmetric matrix. Is this true only for an orthonormal tetrad, or is it true more generally?

Solution. An infinitesimal Lorentz transformation is the sum of the unit matrix and an infinitesimal piece $\Delta L_k{}^m$, the generator of the infinitesimal Lorentz transformation,

$$ L_k{}^m = \delta_k^m + \Delta L_k{}^m . \tag{11.15} $$

Under such an infinitesimal Lorentz transformation, the tetrad metric transforms to

$$ \gamma'_{kl} = (\delta_k^m + \Delta L_k{}^m)(\delta_l^n + \Delta L_l{}^n) \, \gamma_{mn} \approx \gamma_{kl} + \Delta L_{kl} + \Delta L_{lk} , \tag{11.16} $$

which by proposition equals the original tetrad metric $\gamma_{kl}$, equation (11.14). It follows that

$$ \Delta L_{kl} + \Delta L_{lk} = 0 , \tag{11.17} $$

that is, the generator $\Delta L_{kl}$ is antisymmetric, as claimed. ⋄

11.5 Tetrad Tensor

In general, a tetrad-frame tensor $A^{kl...}_{mn...}$ is an object that transforms under tetrad (Lorentz) transformations (11.13) as

$$ A'^{kl...}_{mn...} = L^k{}_a L^l{}_b \, ... \, L_m{}^c L_n{}^d \, ... \, A^{ab...}_{cd...} . \tag{11.18} $$

11.6 Raising and lowering indices

In the coordinate approach to GR, coordinate indices were lowered and raised with the coordinate metric $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$. In the tetrad formalism there are two kinds of indices, tetrad indices and coordinate indices, and they flip around as follows:

1. Lower and raise coordinate indices with the coordinate metric $g_{\mu\nu}$ and its inverse $g^{\mu\nu}$;
2. Lower and raise tetrad indices with the tetrad metric $\gamma_{mn}$ and its inverse $\gamma^{mn}$;
3. Switch between coordinate and tetrad frames with the vierbein $e_m{}^\mu$ and its inverse $e^m{}_\mu$.

The kinds of objects for which this flippery is valid are called tensors. Tensors with only tetrad indices, such as the tetrad axes $\boldsymbol{\gamma}_m$ or the tetrad metric $\gamma_{mn}$, are called tetrad tensors, and they remain unchanged under coordinate transformations. Tensors with only coordinate indices, such as the coordinate tangent axes $\boldsymbol{g}_\mu$ or the coordinate metric $g_{\mu\nu}$, are called coordinate tensors, and they remain unchanged under tetrad transformations. Tensors may also be mixed, such as the vierbein $e_m{}^\mu$. And of course just because something has an index, greek or latin, does not make it a tensor: a tensor is a tensor if and only if it transforms like a tensor.

11.7 Gauge transformations

Gauge transformations are transformations of the coordinates or tetrad. Such transformations do not change the underlying spacetime. Quantities that are unchanged by a coordinate transformation are coordinate gauge-invariant. Quantities that are unchanged under a tetrad transformation are tetrad gauge-invariant. For example, tetrad tensors are coordinate gauge-invariant, while coordinate tensors are tetrad gauge-invariant.

Tetrad transformations have the 6 degrees of freedom of Lorentz transformations, with 3 degrees of freedom in spatial rotations, and 3 more in Lorentz boosts. General coordinate transformations have 4 degrees of freedom. Thus there are 10 degrees of freedom in the choice of tetrad and coordinate system. The 16 degrees of freedom of the vierbein, minus the 10 degrees of freedom from the transformations of the tetrad and coordinates, leave 6 physical degrees of freedom in spacetime, the same as in the coordinate approach to GR, which is as it should be.

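11.8 Directed derivatives

Directed derivatives $\partial_m$ are defined to be the directional derivatives along the axes $\boldsymbol{\gamma}_m$

$$ \partial_m \equiv \boldsymbol{\gamma}_m \cdot \boldsymbol{\partial} = \boldsymbol{\gamma}_m \cdot \boldsymbol{g}^\mu \frac{\partial}{\partial x^\mu} = e_m{}^\mu \frac{\partial}{\partial x^\mu} \quad \mbox{is a tetrad 4-vector} . \tag{11.19} $$

The directed derivative $\partial_m$ is independent of the choice of coordinates, as signalled by the fact that it has only a tetrad index, no coordinate index. Unlike coordinate derivatives $\partial/\partial x^\mu$, directed derivatives $\partial_m$ do not commute. Their commutator is

$$ [\partial_m , \partial_n] = \left[ e_m{}^\mu \frac{\partial}{\partial x^\mu} , e_n{}^\nu \frac{\partial}{\partial x^\nu} \right] = e_m{}^\mu \frac{\partial e_n{}^\nu}{\partial x^\mu} \frac{\partial}{\partial x^\nu} - e_n{}^\nu \frac{\partial e_m{}^\mu}{\partial x^\nu} \frac{\partial}{\partial x^\mu} = (d^k{}_{nm} - d^k{}_{mn}) \, \partial_k \quad \mbox{is not a tetrad tensor} \tag{11.20} $$

where $d_{lmn} \equiv \gamma_{lk} \, d^k{}_{mn}$ is the vierbein derivative

$$ d_{lmn} \equiv \gamma_{lk} \, e^k{}_\kappa \, e_n{}^\nu \frac{\partial e_m{}^\kappa}{\partial x^\nu} \quad \mbox{is not a tetrad tensor} . \tag{11.21} $$

Since the vierbein and inverse vierbein are inverse to each other, an equivalent definition of $d_{lmn}$ in terms of the inverse vierbein is

$$ d_{lmn} \equiv - \gamma_{lk} \, e_m{}^\mu \, e_n{}^\nu \frac{\partial e^k{}_\mu}{\partial x^\nu} \quad \mbox{is not a tetrad tensor} . \tag{11.22} $$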

11.9 Tetrad covariant derivative

The derivation of tetrad covariant derivatives $D_m$ follows precisely the analogous derivation of coordinate covariant derivatives $D_\mu$. The tetrad-frame formulae look entirely similar to the coordinate-frame formulae, with the replacement of coordinate partial derivatives by directed derivatives, $\partial/\partial x^\mu \to \partial_m$, and the replacement of coordinate-frame connections by tetrad-frame connections $\Gamma^\kappa{}_{\mu\nu} \to \Gamma^k{}_{mn}$. There are two things to be careful about: first, unlike coordinate partial derivatives, directed derivatives $\partial_m$ do not commute; and second, neither tetrad-frame nor coordinate-frame connections are tensors, and therefore it should be no surprise that the tetrad-frame connections $\Gamma_{lmn}$ are not related to the coordinate-frame connections $\Gamma_{\lambda\mu\nu}$ by the ‘usual’ vierbein transformations. Rather, the tetrad and coordinate connections are related by equation (11.32).

If $\Phi$ is a scalar, then $\partial_m \Phi$ is a tetrad 4-vector. The tetrad covariant derivative of a scalar is just the directed derivative

$$ D_m \Phi = \partial_m \Phi \quad \mbox{is a tetrad 4-vector} . \tag{11.23} $$

If $A^m$ is a tetrad 4-vector, then $\partial_n A^m$ is not a tetrad tensor, and $\partial_n A_m$ is not a tetrad tensor. But the 4-vector $\boldsymbol{A} = \boldsymbol{\gamma}_m A^m$, being by construction invariant under both tetrad and coordinate transformations, is a scalar, and its directed derivative is therefore a 4-vector

$$ \partial_n \boldsymbol{A} = \partial_n (\boldsymbol{\gamma}_m A^m) = \boldsymbol{\gamma}_m \, \partial_n A^m + (\partial_n \boldsymbol{\gamma}_m) A^m = \boldsymbol{\gamma}_m \, \partial_n A^m + \Gamma^k{}_{mn} \, \boldsymbol{\gamma}_k A^m \quad \mbox{is a tetrad 4-vector} \tag{11.24} $$

where the tetrad-frame connection coefficients, $\Gamma^k{}_{mn}$, also known as Ricci rotation coefficients (or, in the context of Newman-Penrose tetrads, spin coefficients) are defined by

$$ \partial_n \boldsymbol{\gamma}_m \equiv \Gamma^k{}_{mn} \, \boldsymbol{\gamma}_k \quad \mbox{is not a tetrad tensor} . \tag{11.25} $$

Equation (11.24) shows that

$$ \partial_n \boldsymbol{A} = \boldsymbol{\gamma}_k (D_n A^k) \quad \mbox{is a tetrad tensor} \tag{11.26} $$

where $D_n A^k$ is the covariant derivative of the contravariant 4-vector $A^k$

$$ D_n A^k \equiv \partial_n A^k + \Gamma^k{}_{mn} A^m \quad \mbox{is a tetrad tensor} . \tag{11.27} $$

Similarly,

$$ \partial_n \boldsymbol{A} = \boldsymbol{\gamma}^k (D_n A_k) \tag{11.28} $$

where $D_n A_k$ is the covariant derivative of the covariant 4-vector $A_k$

$$ D_n A_k \equiv \partial_n A_k - \Gamma^m{}_{kn} A_m \quad \mbox{is a tetrad tensor} . \tag{11.29} $$

In general, the covariant derivative of a tensor is

$$ D_a A^{kl...}_{mn...} = \partial_a A^{kl...}_{mn...} + \Gamma^k{}_{ba} A^{bl...}_{mn...} + \Gamma^l{}_{ba} A^{kb...}_{mn...} + ... - \Gamma^b{}_{ma} A^{kl...}_{bn...} - \Gamma^b{}_{na} A^{kl...}_{mb...} - ... \tag{11.30} $$

with a positive $\Gamma$ term for each contravariant index, and a negative $\Gamma$ term for each covariant index.

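11.10 Relation between tetrad and coordinate connections

The relation between the tetrad connections $\Gamma^k{}_{mn}$ and their coordinate counterparts $\Gamma^\kappa{}_{\mu\nu}$ follows from

$$ \Gamma^k{}_{mn} \, \boldsymbol{\gamma}_k = \partial_n \boldsymbol{\gamma}_m = e_n{}^\nu \frac{\partial (e_m{}^\kappa \boldsymbol{g}_\kappa)}{\partial x^\nu} = e_n{}^\nu \frac{\partial e_m{}^\kappa}{\partial x^\nu} \boldsymbol{g}_\kappa + e_n{}^\nu e_m{}^\kappa \frac{\partial \boldsymbol{g}_\kappa}{\partial x^\nu} = d^k{}_{mn} \, e_k{}^\kappa \, \boldsymbol{g}_\kappa + e_n{}^\nu e_m{}^\kappa \, \Gamma^\lambda{}_{\kappa\nu} \, \boldsymbol{g}_\lambda \quad \mbox{is not a tetrad tensor} . \tag{11.31} $$

Thus the relation is

$$ \Gamma_{lmn} - d_{lmn} = e_l{}^\lambda \, e_m{}^\mu \, e_n{}^\nu \, \Gamma_{\lambda\mu\nu} \quad \mbox{is not a tetrad tensor} \tag{11.32} $$

where

$$ \Gamma_{lmn} \equiv \gamma_{lk} \, \Gamma^k{}_{mn} . \tag{11.33} $$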

11.11 Torsion tensor

The torsion tensor $S^m{}_{kl}$, which GR assumes to vanish, is defined in the usual way by the commutator of the covariant derivative acting on a scalar $\Phi$

$$ [D_k , D_l] \, \Phi = S^m{}_{kl} \, \partial_m \Phi \quad \mbox{is a tetrad tensor} . \tag{11.34} $$

The expression (11.29) for the covariant derivatives coupled with the commutator (11.20) of directed derivatives shows that the torsion tensor is

$$ S^m{}_{kl} = \Gamma^m{}_{kl} - \Gamma^m{}_{lk} - d^m{}_{kl} + d^m{}_{lk} \quad \mbox{is a tetrad tensor} \tag{11.35} $$

where $d^m{}_{kl}$ are the vierbein derivatives defined by equation (11.21). The torsion tensor $S^m{}_{kl}$ is antisymmetric in $k \leftrightarrow l$, as is evident from its definition (11.34).

11.12 No-torsion condition

GR assumes vanishing torsion. Then equation (11.35) implies the no-torsion condition

$$ \Gamma_{mkl} - d_{mkl} = \Gamma_{mlk} - d_{mlk} \quad \mbox{is not a tetrad tensor} . \tag{11.36} $$

In view of the relation (11.32) between tetrad and coordinate connections, the no-torsion condition (11.36) is equivalent to the usual symmetry condition $\Gamma_{\mu\kappa\lambda} = \Gamma_{\mu\lambda\kappa}$ on the coordinate frame connections, as it should be.

11.13 Antisymmetry of the connection coefficients

The directed derivative of the tetrad metric is

$$ \partial_n \gamma_{lm} = \partial_n (\boldsymbol{\gamma}_l \cdot \boldsymbol{\gamma}_m) = \boldsymbol{\gamma}_l \cdot \partial_n \boldsymbol{\gamma}_m + \boldsymbol{\gamma}_m \cdot \partial_n \boldsymbol{\gamma}_l = \Gamma_{lmn} + \Gamma_{mln} . \tag{11.37} $$

In most cases of interest, including orthonormal, spinor, and null tetrads, the tetrad metric is chosen to be a constant. For example, if the tetrad is orthonormal, then the tetrad metric is the Minkowski metric, which is constant, the same everywhere. If the tetrad metric is constant, then all derivatives of the tetrad metric vanish, and then equation (11.37) shows that the tetrad connections are antisymmetric in their first two indices

$$ \Gamma_{lmn} = - \Gamma_{mln} . \tag{11.38} $$

This antisymmetry reflects the fact that $\Gamma_{lmn}$ is the generator of a Lorentz transformation for each $n$.

11.14 Connection coefficients in terms of the vierbein

In the general case of non-constant tetrad metric, and non-vanishing torsion, the following manipulation

$$ \partial_n \gamma_{lm} + \partial_m \gamma_{ln} - \partial_l \gamma_{mn} = \Gamma_{lmn} + \Gamma_{mln} + \Gamma_{lnm} + \Gamma_{nlm} - \Gamma_{mnl} - \Gamma_{nml} $$

$$ = 2 \, \Gamma_{lmn} - S_{lmn} - S_{mnl} - S_{nml} - d_{lmn} + d_{lnm} - d_{mnl} + d_{mln} - d_{nml} + d_{nlm} \tag{11.39} $$

implies that the tetrad connections $\Gamma_{lmn}$ are given in terms of the derivatives $\partial_n \gamma_{lm}$ of the tetrad metric, the torsion $S_{lmn}$, and the vierbein derivatives $d_{lmn}$ by

$$ \Gamma_{lmn} = \tfrac{1}{2} \left( \partial_n \gamma_{lm} + \partial_m \gamma_{ln} - \partial_l \gamma_{mn} + S_{lmn} + S_{mnl} + S_{nml} + d_{lmn} - d_{lnm} + d_{mnl} - d_{mln} + d_{nml} - d_{nlm} \right) \quad \mbox{is not a tetrad tensor} . \tag{11.40} $$

If torsion vanishes, as GR assumes, and if furthermore the tetrad metric is constant, then equation (11.40) simplifies to the following expression for the tetrad connections in terms of the vierbein derivatives $d_{lmn}$ defined by (11.21)

$$ \Gamma_{lmn} = \tfrac{1}{2} \left( d_{lmn} - d_{lnm} + d_{mnl} - d_{mln} + d_{nml} - d_{nlm} \right) \quad \mbox{is not a tetrad tensor} . \tag{11.41} $$

This is the formula that allows connection coefficients to be calculated from the vierbein.
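Formula (11.41) is mechanical enough to code directly. The sketch below applies it to an example that is not in the text: the orthonormal tetrad of plane polar coordinates in flat space, for which the tetrad metric is the Euclidean $\delta_{mn}$ and the only non-vanishing vierbein derivative is $d_{\theta\theta r} = -1/r$. The dictionary representation and the names are illustrative choices.

```python
from itertools import product


def connection_from_vierbein_derivatives(d, dim):
    """Gamma_{lmn} from equation (11.41), for zero torsion and constant tetrad metric.

    d maps index triples (l, m, n) to d_{lmn}; missing triples are taken as zero.
    """
    def dd(l, m, n):
        return d.get((l, m, n), 0.0)

    return {
        (l, m, n): 0.5 * (dd(l, m, n) - dd(l, n, m) + dd(m, n, l)
                          - dd(m, l, n) + dd(n, m, l) - dd(n, l, m))
        for l, m, n in product(range(dim), repeat=3)
    }


R, TH = 0, 1          # tetrad index labels for the polar example
radius = 2.0
vierbein_derivatives = {(TH, TH, R): -1.0 / radius}

gamma = connection_from_vierbein_derivatives(vierbein_derivatives, dim=2)
print(gamma[(R, TH, TH)], gamma[(TH, R, TH)])  # expect -1/r and +1/r
```

The result $\Gamma_{r\theta\theta} = -1/r$, $\Gamma_{\theta r\theta} = +1/r$ matches the familiar statement that the unit vectors rotate into each other as one moves in the angular direction, and it exhibits the antisymmetry (11.38) in the first two indices.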

11.15 Riemann curvature tensor

The Riemann curvature tensor $R_{klmn}$ is defined in the usual way by the commutator of the covariant derivative acting on a contravariant 4-vector. In the presence of torsion,

$$ [D_k , D_l] \, A_m \equiv S^n{}_{kl} \, D_n A_m + R_{klmn} A^n \quad \mbox{is a tetrad tensor} . \tag{11.42} $$

If torsion vanishes, as GR assumes, then the definition (11.42) reduces to

$$ [D_k , D_l] \, A_m \equiv R_{klmn} A^n \quad \mbox{is a tetrad tensor} . \tag{11.43} $$

The expression (11.29) for the covariant derivative coupled with the torsion equation (11.34) yields the following formula for the Riemann tensor in terms of connection coefficients, for the general case of non-vanishing torsion:

$$ R_{klmn} = \partial_k \Gamma_{mnl} - \partial_l \Gamma_{mnk} + \Gamma^a{}_{ml} \Gamma_{ank} - \Gamma^a{}_{mk} \Gamma_{anl} + (\Gamma^a{}_{kl} - \Gamma^a{}_{lk} - S^a{}_{kl}) \Gamma_{mna} \quad \mbox{is a tetrad tensor} . \tag{11.44} $$

The formula has extra terms $(\Gamma^a{}_{kl} - \Gamma^a{}_{lk} - S^a{}_{kl}) \Gamma_{mna}$ compared to the usual formula for the coordinate-frame Riemann tensor $R_{\kappa\lambda\mu\nu}$. If torsion vanishes, as GR assumes, then

$$ R_{klmn} = \partial_k \Gamma_{mnl} - \partial_l \Gamma_{mnk} + \Gamma^a{}_{ml} \Gamma_{ank} - \Gamma^a{}_{mk} \Gamma_{anl} + (\Gamma^a{}_{kl} - \Gamma^a{}_{lk}) \Gamma_{mna} \quad \mbox{is a tetrad tensor} . \tag{11.45} $$

The symmetries of the tetrad-frame Riemann tensor are the same as those of the coordinate-frame Riemann tensor. For vanishing torsion, these are

$$ R_{([kl][mn])} , \tag{11.46} $$

$$ R_{klmn} + R_{knlm} + R_{kmnl} = 0 . \tag{11.47} $$

Exercise 11.3 Riemann tensor. From the definition (11.42), derive the expression (11.44) for the Riemann tensor. Show that, in addition to the antisymmetry in $kl$ which follows immediately from the definition (11.42), the Riemann tensor $R_{klmn}$ is antisymmetric in the indices $mn$. [Hint: Start by expanding out the definition (11.42) using the definition (11.30) of the covariant derivative. You will find it easier to derive an expression for the Riemann tensor with one index raised, such as $R_{klm}{}^n$, but you should resist the temptation to leave it there, because the symmetries of the Riemann tensor are obscured when one index is raised. To switch to all lowered indices, you will need to convert terms such as $\partial_k \Gamma^n{}_{ml}$ by

$$ \partial_k \Gamma^n{}_{ml} = \partial_k (\gamma^{np} \Gamma_{pml}) = \gamma^{np} \, \partial_k \Gamma_{pml} + \Gamma_{pml} \, \partial_k \gamma^{np} . \tag{11.48} $$

You should show that the directed derivative $\partial_k \gamma^{np}$ in this expression is related to tetrad connections through a formula similar to equation (11.37)

$$ \partial_k \gamma^{np} = - \Gamma^{np}{}_k - \Gamma^{pn}{}_k , \tag{11.49} $$

which you should recognize as equivalent to $D_k \gamma^{np} = 0$. To complete the derivation, show that

$$ \partial_k (\Gamma_{mnl} + \Gamma_{nml}) - \partial_l (\Gamma_{mnk} + \Gamma_{nmk}) = [\partial_k , \partial_l] \, \gamma_{mn} = (\Gamma^a{}_{lk} - \Gamma^a{}_{kl} + S^a{}_{kl})(\Gamma_{mna} + \Gamma_{nma}) . \tag{11.50} $$

The antisymmetry of $R_{klmn}$ in $mn$ follows from equation (11.50).] ⋄

11.16 Ricci, Einstein, Bianchi

The usual suite of formulae leading to Einstein’s equations apply. Since all the quantities are tensors, and all the equations are tensor equations, their form follows immediately from their coordinate counterparts.

Ricci tensor:

$$ R_{km} \equiv \gamma^{ln} R_{klmn} . \tag{11.51} $$

Ricci scalar:

$$ R \equiv \gamma^{km} R_{km} . \tag{11.52} $$

Einstein tensor:

$$ G_{km} \equiv R_{km} - \tfrac{1}{2} \gamma_{km} R . \tag{11.53} $$

Einstein’s equations:

$$ G_{km} = 8\pi G \, T_{km} . \tag{11.54} $$

Bianchi identities:

$$ D_k R_{lmnp} + D_l R_{mknp} + D_m R_{klnp} = 0 , \tag{11.55} $$

which most importantly imply covariant conservation of the Einstein tensor, hence conservation of energy-momentum

$$ D^k T_{km} = 0 . \tag{11.56} $$

11.17 Electromagnetism 11.17.1 Electromagnetic potential and field The electromagnetic field is derivable from an electromagnetic 4-potential Am , Am = (φ, A) .

(11.57)

The electromagnetic field is a bivector field (an antisymmetric tensor) Fmn , Fmn ≡ Dn Am − Dm An .

(11.58)

If torsion vanishes, as general relativity assumes, then the covariant derivatives in the definition (11.58) can be replaced by directed derivatives, Fmn = ∂n Am − ∂m An .

(11.59)

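If the tetrad is orthonormal, then the traditional description of the electromagnetic field in terms of electric and magnetic fields $\boldsymbol{E}$ and $\boldsymbol{B}$ is valid:
$$\boldsymbol{E} = - \partial_t \boldsymbol{A} - \boldsymbol{\nabla} \phi , \quad \boldsymbol{B} = \boldsymbol{\nabla} \times \boldsymbol{A} , \tag{11.60}$$
where $\boldsymbol{\nabla} \equiv \{ \partial_x , \partial_y , \partial_z \}$ denotes the spatial tetrad directed derivative 3-vector. In an orthonormal tetrad $\{ \boldsymbol{\gamma}_t , \boldsymbol{\gamma}_x , \boldsymbol{\gamma}_y , \boldsymbol{\gamma}_z \}$, the 6 components of the electromagnetic field $F_{mn}$ are related to the electric and magnetic fields $\boldsymbol{E} = \{ E_x , E_y , E_z \}$ and $\boldsymbol{B} = \{ B_x , B_y , B_z \}$ by
$$F_{mn} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0 \end{pmatrix} , \quad F^{mn} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix} . \tag{11.61}$$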

11.17.2 Lorentz force law In the presence of an electromagnetic field $F^{mn}$, the general relativistic equation of motion for the 4-velocity $u^m \equiv dx^m / d\tau$ of a particle of mass $m$ and charge $q$ is modified by the addition of a Lorentz force $q F_n{}^m u^n$
$$m \frac{D u^m}{D\tau} = q F_n{}^m u^n . \tag{11.62}$$
In the absence of gravitational fields, so $D/D\tau = d/d\tau$, and with $u^m = u^t \{ 1 , \boldsymbol{v} \}$ where $\boldsymbol{v}$ is the 3-velocity, the spatial components of equation (11.62) reduce to [note that $d/dt = (1/u^t) \, d/d\tau$]
$$m \frac{d ( u^t \boldsymbol{v} )}{dt} = q \, ( \boldsymbol{E} + \boldsymbol{v} \times \boldsymbol{B} ) \tag{11.63}$$

which is the classical special relativistic Lorentz force law. The signs in the expression (11.61) for Fmn in terms of E = {Ex , Ey , Ez } and B = {Bx , By , Bz } are arranged to agree with the classical law (11.63).
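The agreement between the tensor form (11.62) and the classical law (11.63) can be checked numerically. The following is a minimal sketch, assuming an orthonormal tetrad with metric diag(−1, 1, 1, 1), index order (t, x, y, z), and the component layout of $F^{mn}$ as read from equation (11.61); the function names are illustrative only and do not come from the text.

```python
# Index order (t, x, y, z); orthonormal tetrad metric diag(-1, 1, 1, 1).
ETA = [-1.0, 1.0, 1.0, 1.0]


def field_tensor_upper(E, B):
    """Contravariant F^{mn}, with signs as read from equation (11.61)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]


def tensor_force(E, B, v, q=1.0):
    """Spatial components of q F_n^m u^n, divided by u^t, for u^m = u^t {1, v}."""
    F = field_tensor_upper(E, B)
    u = [1.0, v[0], v[1], v[2]]
    # Lower the first index with the diagonal metric: F_n^m = ETA[n] * F^{nm}.
    return [q * sum(ETA[n] * F[n][m] * u[n] for n in range(4)) for m in (1, 2, 3)]


def classical_force(E, B, v, q=1.0):
    """q (E + v x B), the right-hand side of equation (11.63)."""
    cross = [
        v[1] * B[2] - v[2] * B[1],
        v[2] * B[0] - v[0] * B[2],
        v[0] * B[1] - v[1] * B[0],
    ]
    return [q * (E[i] + cross[i]) for i in range(3)]
```

For any choice of E, B, and v the two functions should return the same 3-vector, which is the sense in which the signs in (11.61) are arranged to agree with (11.63).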

11.17.3 Maxwell’s equations The source-free Maxwell’s equations are
$$D_l F_{mn} + D_m F_{nl} + D_n F_{lm} = 0 , \tag{11.64}$$
while the sourced Maxwell’s equations are
$$D_m F^{mn} = 4\pi j^n , \tag{11.65}$$
where $j^n$ is the electric 4-current. The sourced Maxwell’s equations (11.65) coupled with the antisymmetry of the electromagnetic field tensor $F_{mn}$ ensure conservation of electric charge
$$D_n j^n = 0 . \tag{11.66}$$

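In flat space with a Minkowski metric, the covariant derivatives simplify to ordinary derivatives, $D_m \to \partial_m \to \partial / \partial x^m$, and the source-free Maxwell’s equations (11.64) reduce to the traditional form
$$\boldsymbol{\nabla} \cdot \boldsymbol{B} = 0 , \quad \boldsymbol{\nabla} \times \boldsymbol{E} + \frac{\partial \boldsymbol{B}}{\partial t} = 0 , \tag{11.67}$$
while the sourced Maxwell’s equations reduce to
$$\boldsymbol{\nabla} \cdot \boldsymbol{E} = 4\pi q , \quad \boldsymbol{\nabla} \times \boldsymbol{B} - \frac{\partial \boldsymbol{E}}{\partial t} = 4\pi \boldsymbol{j} , \tag{11.68}$$
where the electric charge density $q$ and the electric current density $\boldsymbol{j}$ are the time and space components of the electric 4-current
$$j^n = \{ q , \boldsymbol{j} \} . \tag{11.69}$$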

11.17.4 Electromagnetic energy-momentum tensor The energy-momentum tensor $T_e^{mn}$ of an electromagnetic field $F_{mn}$ is
$$T_e^{mn} = \frac{1}{4\pi} \left( F^m{}_k F^{nk} - \frac{1}{4} \gamma^{mn} F_{kl} F^{kl} \right) . \tag{11.70}$$
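As a consistency check on equation (11.70), the sketch below evaluates $T_e^{mn}$ in an orthonormal tetrad. It assumes the metric diag(−1, 1, 1, 1), index order (t, x, y, z), and the $F^{mn}$ layout read from equation (11.61); the helper names are hypothetical. Under those assumptions the time-time component should come out as the familiar energy density $(E^2 + B^2)/8\pi$, the time-space components as $\boldsymbol{E} \times \boldsymbol{B} / 4\pi$, and the trace as zero.

```python
import math

# Index order (t, x, y, z); orthonormal tetrad metric diag(-1, 1, 1, 1).
ETA = [-1.0, 1.0, 1.0, 1.0]


def field_tensor_upper(E, B):
    """Contravariant F^{mn}, with signs as read from equation (11.61)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]


def em_energy_momentum(E, B):
    """Contravariant T^{mn} of equation (11.70) as a 4x4 nested list."""
    F = field_tensor_upper(E, B)
    # Invariant F_{kl} F^{kl}: lower both indices with the diagonal metric.
    invariant = sum(
        ETA[k] * ETA[l] * F[k][l] ** 2 for k in range(4) for l in range(4)
    )
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            # F^m_k F^{nk} = sum_k ETA[k] F^{mk} F^{nk}
            quadratic = sum(ETA[k] * F[m][k] * F[n][k] for k in range(4))
            gamma_mn = ETA[m] if m == n else 0.0  # inverse metric gamma^{mn}
            T[m][n] = (quadratic - 0.25 * gamma_mn * invariant) / (4.0 * math.pi)
    return T
```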

12 ∗

More on the tetrad formalism

This chapter presents some more advanced aspects of the tetrad formalism. It discusses spinor tetrads (§12.1) and Newman-Penrose tetrads (§12.2). The chapter goes on to show how the fields that describe electromagnetic (§12.3) and gravitational (§12.4) waves have a natural and insightful complex structure that is brought out in a Newman-Penrose tetrad. The Newman-Penrose formalism provides a natural context for the Petrov classification of the Weyl tensor (§12.5), and for the derivation of the Raychaudhuri equations (§12.6) which imply the focussing theorem (§12.7) that is a key ingredient of the Penrose-Hawking singularity theorems.

12.1 Spinor tetrad formalism In quantum mechanics, fundamental particles have spin. The 3 generations of leptons (electrons, muons, tauons, and their respective neutrino partners) and quarks (up, charm, top, and their down, strange, and bottom partners) have spin $\frac{1}{2}$ (in units $\hbar = 1$). The carrier particles of the electromagnetic force (photons), the weak force (the $W^\pm$ and $Z$ bosons), and the colour force (the 8 gluons), have spin 1. The carrier of the gravitational force, the graviton, is expected to have spin 2, though as of 2010 no gravitational wave, let alone its quantum, the graviton, has been detected. General relativity is a classical, not quantum, theory. Nevertheless the spin properties of classical waves, such as electromagnetic or gravitational waves, are already apparent classically.

12.1.1 Spinor tetrad A systematic way to project objects into spin components is to work in a spinor tetrad. As will become apparent below, equation (12.5), spin describes how an object transforms under rotation about some preferred axis. In the case of an electromagnetic or gravitational wave, the natural preferred axis is the direction of propagation of the wave. With respect to the direction of propagation, electromagnetic waves prove to have two possible spins, or helicities, ±1, while gravitational waves have two possible spins, or helicities, ±2. A preferred axis might also be set by an experimenter who chooses to measure spin along some particular direction. The following treatment takes the preferred direction to lie along the z-axis $\boldsymbol{\gamma}_z$, but there is no loss of generality in making this choice. Start with an orthonormal tetrad $\{ \boldsymbol{\gamma}_t , \boldsymbol{\gamma}_x , \boldsymbol{\gamma}_y , \boldsymbol{\gamma}_z \}$. If the preferred tetrad axis is the z-axis $\boldsymbol{\gamma}_z$, then the spinor tetrad axes $\{ \boldsymbol{\gamma}_+ , \boldsymbol{\gamma}_- \}$ are defined to be complex combinations of the transverse axes $\{ \boldsymbol{\gamma}_x , \boldsymbol{\gamma}_y \}$,
$$\boldsymbol{\gamma}_+ \equiv \tfrac{1}{\sqrt{2}} ( \boldsymbol{\gamma}_x + i \boldsymbol{\gamma}_y ) , \tag{12.1a}$$
$$\boldsymbol{\gamma}_- \equiv \tfrac{1}{\sqrt{2}} ( \boldsymbol{\gamma}_x - i \boldsymbol{\gamma}_y ) . \tag{12.1b}$$

The tetrad metric of the spinor tetrad $\{ \boldsymbol{\gamma}_t , \boldsymbol{\gamma}_z , \boldsymbol{\gamma}_+ , \boldsymbol{\gamma}_- \}$ is
$$\gamma_{mn} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} . \tag{12.2}$$
Notice that the spinor axes $\{ \boldsymbol{\gamma}_+ , \boldsymbol{\gamma}_- \}$ are themselves null, $\boldsymbol{\gamma}_+ \cdot \boldsymbol{\gamma}_+ = \boldsymbol{\gamma}_- \cdot \boldsymbol{\gamma}_- = 0$, whereas their scalar product with each other is non-zero, $\boldsymbol{\gamma}_+ \cdot \boldsymbol{\gamma}_- = 1$. The null character of the spinor axes is what makes spin especially well-suited to describing fields, such as electromagnetism and gravity, that propagate at the speed of light. An even better trick in dealing with fields that propagate at the speed of light is to work in a Newman-Penrose tetrad, §12.2, in which all 4 tetrad axes are taken to be null.

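12.1.2 Transformation of spin under rotation about the preferred axis Under a right-handed rotation by angle $\chi$ about the preferred axis $\boldsymbol{\gamma}_z$, the transverse axes $\boldsymbol{\gamma}_x$ and $\boldsymbol{\gamma}_y$ transform as
$$\boldsymbol{\gamma}_x \to \cos\chi \, \boldsymbol{\gamma}_x - \sin\chi \, \boldsymbol{\gamma}_y , \quad \boldsymbol{\gamma}_y \to \sin\chi \, \boldsymbol{\gamma}_x + \cos\chi \, \boldsymbol{\gamma}_y . \tag{12.3}$$
It follows that the spinor axes $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ transform under a right-handed rotation by angle $\chi$ about $\boldsymbol{\gamma}_z$ as
$$\boldsymbol{\gamma}_\pm \to e^{\pm i\chi} \, \boldsymbol{\gamma}_\pm . \tag{12.4}$$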

The transformation (12.4) identifies the spinor axes γ+ and γ− as having spin +1 and −1 respectively.
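The step from (12.3) to (12.4) can be confirmed numerically. The sketch below is illustrative only: it represents each transverse axis by its pair of coefficients on the original $\{ \boldsymbol{\gamma}_x , \boldsymbol{\gamma}_y \}$ basis (a representation chosen here for convenience, not taken from the text), applies the rotation (12.3), rebuilds the spinor axes from (12.1), and compares them with $e^{\pm i\chi}$ times the unrotated spinor axes.

```python
import cmath
import math


def rotate(gx, gy, chi):
    """Apply (12.3): gx -> cos(chi) gx - sin(chi) gy, gy -> sin(chi) gx + cos(chi) gy.

    Each axis is a pair of coefficients on the original (gamma_x, gamma_y) basis.
    """
    c, s = math.cos(chi), math.sin(chi)
    new_gx = tuple(c * a - s * b for a, b in zip(gx, gy))
    new_gy = tuple(s * a + c * b for a, b in zip(gx, gy))
    return new_gx, new_gy


def spinor_axes(gx, gy):
    """Build gamma_+ and gamma_- from the transverse axes, equations (12.1)."""
    norm = 1.0 / math.sqrt(2.0)
    plus = tuple(norm * (a + 1j * b) for a, b in zip(gx, gy))
    minus = tuple(norm * (a - 1j * b) for a, b in zip(gx, gy))
    return plus, minus


def spin_phases(chi):
    """Return True if the rotated spinor axes equal exp(+-i chi) times the originals."""
    gx, gy = (1.0, 0.0), (0.0, 1.0)
    plus0, minus0 = spinor_axes(gx, gy)
    plus1, minus1 = spinor_axes(*rotate(gx, gy, chi))
    ok_plus = all(abs(a - cmath.exp(1j * chi) * b) < 1e-12 for a, b in zip(plus1, plus0))
    ok_minus = all(abs(a - cmath.exp(-1j * chi) * b) < 1e-12 for a, b in zip(minus1, minus0))
    return ok_plus and ok_minus
```

Calling `spin_phases(chi)` for any angle should return `True`, reflecting that $\boldsymbol{\gamma}_+$ picks up the phase $e^{i\chi}$ and $\boldsymbol{\gamma}_-$ the phase $e^{-i\chi}$.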

12.1.3 Spin More generally, an object can be defined as having spin $s$ if it varies by
$$e^{is\chi} \tag{12.5}$$
under a right-handed rotation by angle $\chi$ about the preferred axis $\boldsymbol{\gamma}_z$. Thus an object of spin $s$ is unchanged by a rotation of $2\pi/s$ about the preferred axis. A spin-0 object is symmetric about the $\boldsymbol{\gamma}_z$ axis, unchanged by a rotation of any angle about the axis. The $\boldsymbol{\gamma}_z$ axis itself is spin-0, as is the time axis $\boldsymbol{\gamma}_t$.


The components of a tensor in a spinor tetrad inherit spin properties from that of the spinor basis. The general rule is that the spin s of any tensor component is equal to the number of + covariant indices minus the number of − covariant indices: spin s = number of + minus − covariant indices .

(12.6)

12.1.4 Spin flip Under a reflection through the y-axis, the spinor axes swap: γ+ ↔ γ− ,

(12.7)

which may also be accomplished by complex conjugation. Reflection through the y-axis, or equivalently complex conjugation, changes the sign of all spinor indices of a tensor component +↔−.

(12.8)

In short, complex conjugation flips spin, a pretty feature of the spinor formalism.

12.1.5 Spin versus spherical harmonics In physical problems, such as in cosmological perturbations, or in perturbations of spherical black holes, or in the hydrogen atom, spin often appears in conjunction with an expansion in spherical harmonics. Spin should not be confused with spherical harmonics. Spin and spherical harmonics appear together whenever the problem at hand has a symmetry under the 3D special orthogonal group SO(3) of spatial rotations (special means of unit determinant; the full orthogonal group O(3) contains in addition the discrete transformation corresponding to reflection of one of the axes, which flips the sign of the determinant). Rotations in SO(3) are described by 3 Euler angles $\{ \theta , \phi , \chi \}$. Spin is associated with the Euler angle $\chi$. The usual spherical harmonics $Y_{\ell m} ( \theta , \phi )$ are the spin-0 eigenfunctions of SO(3). The eigenfunctions of the full SO(3) group are the spin harmonics
$${}_s Y_{\ell m} ( \theta , \phi , \chi ) = \Theta_{\ell m s} ( \theta ) \, e^{im\phi} e^{is\chi} . \tag{12.9}$$

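12.1.6 Spinor components of the Einstein tensor With respect to a spinor tetrad, the components of the Einstein tensor $G_{mn}$ are
$$G_{mn} = \begin{pmatrix} G_{tt} & G_{tz} & G_{t+} & G_{t-} \\ G_{tz} & G_{zz} & G_{z+} & G_{z-} \\ G_{t+} & G_{z+} & G_{++} & G_{+-} \\ G_{t-} & G_{z-} & G_{+-} & G_{--} \end{pmatrix} . \tag{12.10}$$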


From this it is apparent that the 10 components of the Einstein tensor decompose into 4 spin-0 components, 4 spin-±1 components, and 2 spin-±2 components:
$$\begin{array}{rl} -2 : & G_{--} , \\ -1 : & G_{t-} , \ G_{z-} , \\ 0 : & G_{tt} , \ G_{tz} , \ G_{zz} , \ G_{+-} , \\ +1 : & G_{t+} , \ G_{z+} , \\ +2 : & G_{++} . \end{array} \tag{12.11}$$
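The decomposition (12.11) follows mechanically from the counting rule (12.6). As a small illustration, the sketch below enumerates the 10 independent components of a symmetric rank-2 tensor in the spinor tetrad and groups them by spin; the function names and the string labels such as `"G+-"` are conveniences introduced here, not notation from the text.

```python
from itertools import combinations_with_replacement

# Axis labels of the spinor tetrad {gamma_t, gamma_z, gamma_+, gamma_-}.
AXES = ["t", "z", "+", "-"]


def spin(indices):
    """Rule (12.6): number of + covariant indices minus number of - covariant indices."""
    return indices.count("+") - indices.count("-")


def decompose_symmetric_rank2(symbol="G"):
    """Group the independent components of a symmetric rank-2 tensor by spin."""
    by_spin = {}
    for pair in combinations_with_replacement(AXES, 2):
        by_spin.setdefault(spin(pair), []).append(symbol + "".join(pair))
    return by_spin
```

Printing `decompose_symmetric_rank2()` lists one component each of spin ±2, two each of spin ±1, and four of spin 0, in line with (12.11).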

The 4 spin-0 components are all real; in particular $G_{+-}$ is real since $G_{+-}^* = G_{-+} = G_{+-}$. The 4 spin-±1 and 2 spin-±2 components comprise 3 complex components
$$G_{++}^* = G_{--} , \quad G_{t+}^* = G_{t-} , \quad G_{z+}^* = G_{z-} . \tag{12.12}$$
In some contexts, for example in cosmological perturbation theory, the various spin components are commonly referred to as scalar (spin-0), vector (spin-±1), and tensor (spin-±2).

12.2 Newman-Penrose tetrad formalism The Newman-Penrose formalism (E. T. Newman & R. Penrose, 1962, “An Approach to Gravitational Radiation by a Method of Spin Coefficients,” J. Math. Phys. 3, 566–579; E. (Ted) Newman & R. Penrose, 2009, “Spin-coefficient formalism,” Scholarpedia, 4(6), 7445, http://www.scholarpedia.org/article/Newman-Penrose_formalism) provides a particularly powerful way to deal with fields that propagate at the speed of light. The Newman-Penrose formalism adopts a tetrad in which the two axes $\boldsymbol{\gamma}_v$ (outgoing) and $\boldsymbol{\gamma}_u$ (ingoing) along the direction of propagation are chosen to be lightlike, while the two axes $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ transverse to the direction of propagation are chosen to be spinor axes. Sadly, the literature on the Newman-Penrose formalism is characterized by an arcane and random notation whose principal purpose seems to be to perpetuate exclusivity for an old-boys club of people who understand it. This is unfortunate given the intrinsic power of the formalism. A. Held (1974, “A formalism for the investigation of algebraically special metrics. I,” Commun. Math. Phys. 37, 311–326) comments that the Newman-Penrose formalism presents “a formidable notational barrier to the uninitiate.” For example, the tetrad connections $\Gamma_{kmn}$ are called “spin coefficients,” and assigned individual greek letters that obscure their transformation properties. Do not be fooled: all the standard tetrad formalism presented in Chapter 11 carries through unaltered. One ill-born child of the notation that persists in widespread use is $\psi_{2-s}$ for the spin $s$ component of the Weyl tensor, equations (12.49). Gravitational waves are commonly characterized by the Newman-Penrose (NP) components of the Weyl tensor. The NP components of the Weyl tensor are sometimes referred to as the NP scalars.
The designation as NP scalars is potentially misleading, because the NP components of the Weyl tensor form a tetrad-frame tensor, not a set of scalars (though of course the tetrad-frame Weyl tensor is, like any tetrad-frame quantity, a coordinate scalar). The NP components do become proper quantities, and in that sense scalars, when referred to the frame of a particular observer, such as a gravitational wave telescope, observing along a particular direction. However, the use of the word scalar to describe the components of a tensor is unfortunate.

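12.2.1 Newman-Penrose tetrad A Newman-Penrose tetrad $\{ \boldsymbol{\gamma}_v , \boldsymbol{\gamma}_u , \boldsymbol{\gamma}_+ , \boldsymbol{\gamma}_- \}$ is defined in terms of an orthonormal tetrad $\{ \boldsymbol{\gamma}_t , \boldsymbol{\gamma}_x , \boldsymbol{\gamma}_y , \boldsymbol{\gamma}_z \}$ by
$$\boldsymbol{\gamma}_v \equiv \tfrac{1}{\sqrt{2}} ( \boldsymbol{\gamma}_t + \boldsymbol{\gamma}_z ) , \tag{12.13a}$$
$$\boldsymbol{\gamma}_u \equiv \tfrac{1}{\sqrt{2}} ( \boldsymbol{\gamma}_t - \boldsymbol{\gamma}_z ) , \tag{12.13b}$$
$$\boldsymbol{\gamma}_+ \equiv \tfrac{1}{\sqrt{2}} ( \boldsymbol{\gamma}_x + i \boldsymbol{\gamma}_y ) , \tag{12.13c}$$
$$\boldsymbol{\gamma}_- \equiv \tfrac{1}{\sqrt{2}} ( \boldsymbol{\gamma}_x - i \boldsymbol{\gamma}_y ) , \tag{12.13d}$$
or in matrix form
$$\begin{pmatrix} \boldsymbol{\gamma}_v \\ \boldsymbol{\gamma}_u \\ \boldsymbol{\gamma}_+ \\ \boldsymbol{\gamma}_- \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & i & 0 \\ 0 & 1 & -i & 0 \end{pmatrix} \begin{pmatrix} \boldsymbol{\gamma}_t \\ \boldsymbol{\gamma}_x \\ \boldsymbol{\gamma}_y \\ \boldsymbol{\gamma}_z \end{pmatrix} . \tag{12.14}$$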

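Just as each of the spinor axes $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$ individually specifies not one but two distinct directions $\boldsymbol{\gamma}_x$ and $\boldsymbol{\gamma}_y$, so also each of the null axes $\boldsymbol{\gamma}_v$ and $\boldsymbol{\gamma}_u$ individually specifies not one but two distinct directions $\boldsymbol{\gamma}_t$ and $\boldsymbol{\gamma}_z$. The ingoing null axis $\boldsymbol{\gamma}_u$ may be obtained from the outgoing null axis $\boldsymbol{\gamma}_v$, and vice versa, by the parity operation of inverting all the spatial axes. All four tetrad axes are null
$$\boldsymbol{\gamma}_v \cdot \boldsymbol{\gamma}_v = \boldsymbol{\gamma}_u \cdot \boldsymbol{\gamma}_u = \boldsymbol{\gamma}_+ \cdot \boldsymbol{\gamma}_+ = \boldsymbol{\gamma}_- \cdot \boldsymbol{\gamma}_- = 0 . \tag{12.15}$$
In a profound sense, the null, or lightlike, character of each of the four NP axes explains why the NP formalism is well adapted to treating fields that propagate at the speed of light. The tetrad metric of the Newman-Penrose tetrad $\{ \boldsymbol{\gamma}_v , \boldsymbol{\gamma}_u , \boldsymbol{\gamma}_+ , \boldsymbol{\gamma}_- \}$ is
$$\gamma_{mn} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} . \tag{12.16}$$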
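The metric (12.16) can be recovered directly from the definitions (12.13). The sketch below is a minimal check, assuming the orthonormal tetrad metric diag(−1, 1, 1, 1) in the order (t, x, y, z) and a bilinear scalar product with no complex conjugation (which is what makes $\boldsymbol{\gamma}_+ \cdot \boldsymbol{\gamma}_+$ vanish); the names are illustrative.

```python
import math

# Orthonormal tetrad metric diag(-1, 1, 1, 1), index order (t, x, y, z).
ETA = [-1.0, 1.0, 1.0, 1.0]

S = 1.0 / math.sqrt(2.0)

# Components of the NP axes on the orthonormal tetrad, equations (12.13).
NP_AXES = {
    "v": (S, 0.0, 0.0, S),
    "u": (S, 0.0, 0.0, -S),
    "+": (0.0, S, 1j * S, 0.0),
    "-": (0.0, S, -1j * S, 0.0),
}


def dot(a, b):
    """Bilinear scalar product (no complex conjugation) with the metric ETA."""
    return sum(ETA[i] * a[i] * b[i] for i in range(4))


def np_metric(order=("v", "u", "+", "-")):
    """Tetrad metric gamma_mn = gamma_m . gamma_n of the NP tetrad."""
    return [[dot(NP_AXES[m], NP_AXES[n]) for n in order] for m in order]
```

Up to floating-point rounding, `np_metric()` should reproduce the matrix in (12.16): all diagonal entries vanish, and the only non-zero products are $\boldsymbol{\gamma}_v \cdot \boldsymbol{\gamma}_u = -1$ and $\boldsymbol{\gamma}_+ \cdot \boldsymbol{\gamma}_- = 1$.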


12.2.2 Boost weight A boost along the $\boldsymbol{\gamma}_z$ axis multiplies the outgoing and ingoing axes $\boldsymbol{\gamma}_v$ and $\boldsymbol{\gamma}_u$ by a blueshift factor $\epsilon$ and its reciprocal
$$\boldsymbol{\gamma}_v \to \epsilon \, \boldsymbol{\gamma}_v , \quad \boldsymbol{\gamma}_u \to (1/\epsilon) \, \boldsymbol{\gamma}_u . \tag{12.17}$$
If the observer boosts by velocity $v$ in the $\boldsymbol{\gamma}_z$ direction away from the source, then the blueshift factor is the special relativistic Doppler shift factor
$$\epsilon = \left( \frac{1 - v}{1 + v} \right)^{1/2} . \tag{12.18}$$
The exponent $n$ of the power $\epsilon^n$ by which an object changes under a boost along the $\boldsymbol{\gamma}_z$ axis is called its boost weight. Thus $\boldsymbol{\gamma}_v$ has boost weight +1, and $\boldsymbol{\gamma}_u$ has boost weight −1. The spinor axes $\boldsymbol{\gamma}_\pm$ both have boost weight 0. The NP components of a tensor inherit their boost weight properties from those of the NP basis. The general rule is that the boost weight $n$ of any tensor component is equal to the number of $v$ covariant indices minus the number of $u$ covariant indices:
$$\text{boost weight } n = \text{number of } v \text{ minus } u \text{ covariant indices} . \tag{12.19}$$

12.3 Electromagnetic field tensor 12.3.1 Complexified electromagnetic field tensor The electromagnetic field $F_{mn}$ is a bivector, and as such has a natural complex structure. The real part of the electromagnetic bivector field is the electric field $\boldsymbol{E}$, which changes sign under spatial inversion, while the imaginary part is the magnetic field $\boldsymbol{B}$, which remains unchanged under spatial inversion. In an orthonormal tetrad $\{ \boldsymbol{\gamma}_t , \boldsymbol{\gamma}_x , \boldsymbol{\gamma}_y , \boldsymbol{\gamma}_z \}$, the electromagnetic bivector can be written as though it were a vector with 6 components:
$$F = \begin{pmatrix} \boldsymbol{E} & \boldsymbol{B} \end{pmatrix} = \begin{pmatrix} E_x & E_y & E_z & B_x & B_y & B_z \end{pmatrix} = \begin{pmatrix} F_{tx} & F_{ty} & F_{tz} & F_{zy} & F_{xz} & F_{yx} \end{pmatrix} . \tag{12.20}$$
The bivector has 3 electric and 3 magnetic components:
$$\begin{array}{ll} \text{electric bivector indices:} & tx , \ ty , \ tz , \\ \text{magnetic bivector indices:} & zy , \ xz , \ yx . \end{array} \tag{12.21}$$
The natural complex structure motivates defining the complexified electromagnetic field tensor $\tilde{F}_{kl}$ to be the complex combination
$$\tilde{F}_{kl} \equiv \frac{1}{2} \left( F_{kl} + {}^*\!F_{kl} \right) = \frac{1}{2} \left( \delta_k^m \delta_l^n + \frac{i}{2} \varepsilon_{kl}{}^{mn} \right) F_{mn} \quad \text{is a tetrad tensor} , \tag{12.22}$$
where ${}^*\!F_{kl}$ denotes the Hodge dual of $F_{kl}$
$${}^*\!F_{kl} \equiv \frac{i}{2} \varepsilon_{kl}{}^{mn} F_{mn} . \tag{12.23}$$

Here $\varepsilon_{klmn}$ is the totally antisymmetric tensor (see Exercise 12.1). The overall factor of $\frac{1}{2}$ on the right hand sides of equations (12.22) is introduced so that the complexification operator $P_{kl}{}^{mn} \equiv \frac{1}{2} ( \delta_k^m \delta_l^n + \frac{i}{2} \varepsilon_{kl}{}^{mn} )$ is a projection operator, satisfying $P^2 = P$. The definitions (12.22) and (12.23) of the complexified and dual electromagnetic field tensors $\tilde{F}_{kl}$ and ${}^*\!F_{kl}$ are valid in any frame, not just an orthonormal frame or a Newman-Penrose frame. In an orthonormal frame, the dual ${}^*\!F$ has the structure, in the same notation as (12.20),
$${}^*\!F = i \begin{pmatrix} \boldsymbol{B} & -\boldsymbol{E} \end{pmatrix} . \tag{12.24}$$
In an orthonormal frame the complexified electromagnetic field $\tilde{F}$, equation (12.22), then has the structure
$$\tilde{F} = \tfrac{1}{2} \begin{pmatrix} 1 & -i \end{pmatrix} ( \boldsymbol{E} + i \boldsymbol{B} ) . \tag{12.25}$$
Thus the complexified electromagnetic field effectively embodies the electric and magnetic fields in the complex 3-vector combination $\boldsymbol{E} + i \boldsymbol{B}$. The complexified electromagnetic field is self-dual
$${}^*\!\tilde{F} = \tilde{F} . \tag{12.26}$$
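The projection and self-duality properties can be checked in the 6-component notation of (12.20). The sketch below is illustrative and assumes an orthonormal frame, where the dual acts as in (12.24); a bivector is stored as a 6-tuple (Ex, Ey, Ez, Bx, By, Bz), and the function names are choices made here rather than notation from the text.

```python
def dual(F):
    """Hodge dual in an orthonormal frame, equation (12.24): *(E, B) = i (B, -E)."""
    E, B = F[:3], F[3:]
    return tuple(1j * b for b in B) + tuple(-1j * e for e in E)


def complexify(F):
    """Complexified field, equation (12.22): F~ = (F + *F) / 2."""
    return tuple(0.5 * (f + d) for f, d in zip(F, dual(F)))


def close(a, b, tol=1e-12):
    """Componentwise comparison of two 6-component bivectors."""
    return all(abs(x - y) < tol for x, y in zip(a, b))
```

With these definitions, `dual(dual(F))` returns `F`, `complexify(complexify(F))` returns `complexify(F)` (the projection property $P^2 = P$), and `dual(complexify(F))` returns `complexify(F)` (self-duality). The first three components of `complexify(F)` are $\frac{1}{2} ( \boldsymbol{E} + i \boldsymbol{B} )$ and the last three are $-i$ times those, as in (12.25).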

One advantage of this approach is that Maxwell’s equations (11.64) and (11.65) combine into a single complex equation
$$D_m \tilde{F}^{mn} = 4\pi j^n \tag{12.27}$$
whose real and imaginary parts represent respectively the sourced and source-free Maxwell’s equations. In some cases, such as the Kerr-Newman geometry expressed in the Boyer-Lindquist orthonormal tetrad, equation (15.33), the complexified electromagnetic field makes manifest the inner elegance of the geometry.

Exercise 12.1 Totally antisymmetric tensor. In an orthonormal tetrad $\boldsymbol{\gamma}_m$ where $\boldsymbol{\gamma}_0$ points to the future and $\boldsymbol{\gamma}_1$, $\boldsymbol{\gamma}_2$, $\boldsymbol{\gamma}_3$ are right-handed, the contravariant totally antisymmetric tensor $\varepsilon^{klmn}$ is defined by (this is the opposite sign from MTW’s notation)
$$\varepsilon^{klmn} \equiv [klmn] \tag{12.28}$$
and hence
$$\varepsilon_{klmn} = -[klmn] \tag{12.29}$$
where $[klmn]$ is the totally antisymmetric symbol
$$[klmn] \equiv \begin{cases} +1 & \text{if } klmn \text{ is an even permutation of } 0123 , \\ -1 & \text{if } klmn \text{ is an odd permutation of } 0123 , \\ 0 & \text{if } klmn \text{ are not all different} . \end{cases} \tag{12.30}$$
Argue that in a general basis $\boldsymbol{g}_\mu$ the contravariant totally antisymmetric tensor $\varepsilon^{\kappa\lambda\mu\nu}$ is
$$\varepsilon^{\kappa\lambda\mu\nu} = e_k{}^\kappa e_l{}^\lambda e_m{}^\mu e_n{}^\nu \varepsilon^{klmn} = e \, [\kappa\lambda\mu\nu] \tag{12.31}$$
while its covariant counterpart is
$$\varepsilon_{\kappa\lambda\mu\nu} = -(1/e) \, [\kappa\lambda\mu\nu] \tag{12.32}$$
where $e \equiv | e_m{}^\mu |$ is the determinant of the vierbein.



12.3.2 Newman-Penrose components of the electromagnetic field With respect to a NP null tetrad $\{ \boldsymbol{\gamma}_v , \boldsymbol{\gamma}_u , \boldsymbol{\gamma}_+ , \boldsymbol{\gamma}_- \}$, equation (12.13), the electromagnetic field $F_{mn}$ has 3 distinct complex components, here denoted $\phi_s$, of spins respectively $s = -1$, 0, and +1 in accordance with the rule (12.6):
$$\begin{array}{rl} -1 : & \phi_{-1} \equiv F_{u-} , \\ 0 : & \phi_0 \equiv \tfrac{1}{2} ( F_{uv} + F_{+-} ) , \\ +1 : & \phi_1 \equiv F_{v+} . \end{array} \tag{12.33}$$
The complex conjugates $\phi_s^*$ of the 3 NP components of the electromagnetic field are
$$\begin{array}{l} \phi_{-1}^* = F_{u+} , \\ \phi_0^* = \tfrac{1}{2} ( F_{uv} - F_{+-} ) , \\ \phi_1^* = F_{v-} , \end{array} \tag{12.34}$$
whose spins have the opposite sign, in accordance with the rule (12.8) that complex conjugation flips spin. The above convention that the index $s$ on the NP component $\phi_s$ labels its spin differs from the standard convention, where the spin $s$ component is capriciously denoted $\phi_{1-s}$ (e.g. S. Chandrasekhar, 1983, The Mathematical Theory of Black Holes, Clarendon Press, Oxford):
$$\begin{array}{rl} -1 : & \phi_2 , \\ 0 : & \phi_1 , \\ +1 : & \phi_0 . \end{array} \quad \text{(standard convention, not followed here)} \tag{12.35}$$

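In terms of the electric and magnetic fields $\boldsymbol{E}$ and $\boldsymbol{B}$ in the parent orthonormal tetrad of the NP tetrad, the 3 complex NP components $\phi_s$ of the electromagnetic field are
$$\begin{array}{l} \phi_{-1} = \tfrac{1}{2} \left[ E_x + i B_x - i ( E_y + i B_y ) \right] , \\ \phi_0 = \tfrac{1}{2} ( E_z + i B_z ) , \\ \phi_1 = \tfrac{1}{2} \left[ E_x + i B_x + i ( E_y + i B_y ) \right] . \end{array} \tag{12.36}$$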
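Equations (12.36) follow from the definitions (12.33) by contracting $F_{mn}$ with the NP axes (12.13). The sketch below carries out that contraction numerically. It assumes the covariant component assignment of (12.20), namely $F_{tx} = E_x$, $F_{zy} = B_x$, $F_{xz} = B_y$, $F_{yx} = B_z$, with index order (t, x, y, z); all function names are illustrative.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Components of the NP axes on the orthonormal tetrad (t, x, y, z), equations (12.13).
NP_AXES = {
    "v": (S, 0.0, 0.0, S),
    "u": (S, 0.0, 0.0, -S),
    "+": (0.0, S, 1j * S, 0.0),
    "-": (0.0, S, -1j * S, 0.0),
}


def field_tensor_lower(E, B):
    """Covariant F_{mn} with F_tx = Ex, F_zy = Bx, F_xz = By, F_yx = Bz, as in (12.20)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, Ex, Ey, Ez],
        [-Ex, 0.0, -Bz, By],
        [-Ey, Bz, 0.0, -Bx],
        [-Ez, -By, Bx, 0.0],
    ]


def np_component(F, a, b):
    """Tetrad component F_ab obtained by contracting F_mn with NP axes a and b."""
    return sum(
        NP_AXES[a][m] * NP_AXES[b][n] * F[m][n] for m in range(4) for n in range(4)
    )


def phi_from_tensor(E, B):
    """(phi_-1, phi_0, phi_1) from the definitions (12.33)."""
    F = field_tensor_lower(E, B)
    phi_m1 = np_component(F, "u", "-")
    phi_0 = 0.5 * (np_component(F, "u", "v") + np_component(F, "+", "-"))
    phi_p1 = np_component(F, "v", "+")
    return phi_m1, phi_0, phi_p1


def phi_from_fields(E, B):
    """(phi_-1, phi_0, phi_1) from the closed forms (12.36) in terms of E + iB."""
    Z = [e + 1j * b for e, b in zip(E, B)]
    return 0.5 * (Z[0] - 1j * Z[1]), 0.5 * Z[2], 0.5 * (Z[0] + 1j * Z[1])
```

For real E and B the two routes should agree to rounding error, and the contraction `np_component(F, "u", "+")` should equal the complex conjugate of $\phi_{-1}$, as stated in (12.34).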

Equations (12.36) show that the NP components of the electromagnetic field contain the electric and magnetic fields in the complex combination $\boldsymbol{E} + i \boldsymbol{B}$, just like the complexified electromagnetic field $\tilde{F}_{kl}$, equation (12.25). Explicitly, the NP components are related to the components $\tilde{F}_{kl}$ of the complexified electromagnetic field by
$$\begin{array}{l} \phi_{-1} = \tilde{F}_{tx} - i \tilde{F}_{ty} , \\ \phi_0 = \tilde{F}_{tz} , \\ \phi_1 = \tilde{F}_{tx} + i \tilde{F}_{ty} . \end{array} \tag{12.37}$$
Part of the power of the NP formalism arises from the fact that it exploits the natural complex structure of the electromagnetic bivector field.

12.3.3 Newman-Penrose components of the complexified electromagnetic field The non-vanishing NP components of the complexified electromagnetic field $\tilde{F}_{kl}$ defined by equation (12.22) are
$$\begin{array}{l} \tilde{F}_{u-} = \phi_{-1} , \\ \tilde{F}_{uv} = \tilde{F}_{+-} = \phi_0 , \\ \tilde{F}_{v+} = \phi_1 , \end{array} \tag{12.38}$$
whereas components with bivector indices $v-$ or $u+$ vanish,
$$\tilde{F}_{v-} = \tilde{F}_{u+} = 0 . \tag{12.39}$$
The rule that complex conjugation flips spin fails here because the complexification operator in equation (12.22) breaks the rule. Equations (12.38) and (12.39) show that the complexified electromagnetic field in an NP tetrad contains just 3 distinct non-vanishing complex components, and those components are precisely equal to the complex spin components $\phi_s$.

12.3.4 Propagating components of electromagnetic waves An oscillating electric charge emits electromagnetic waves. Similarly, an electromagnetic wave incident on an electric charge causes it to oscillate. An electromagnetic wave moving away from a source is called outgoing, while a wave moving towards a source is called ingoing. It can be shown that only the spin −1 NP component $\phi_{-1}$ of an outgoing electromagnetic wave propagates, carrying electromagnetic energy to infinity:
$$\phi_{-1} : \text{propagating, outgoing} . \tag{12.40}$$
This propagating, outgoing −1 component has spin −1, but its complex conjugate has spin +1, so effectively both spin components, or helicities, or circular polarizations, of an outgoing electromagnetic wave are embodied in the single complex component $\phi_{-1}$. The remaining 2 complex NP components $\phi_0$ and $\phi_1$ of an outgoing wave are short range, describing the electromagnetic field near the source. Similarly, only the spin +1 component $\phi_1$ of an ingoing electromagnetic wave propagates, carrying energy from infinity:
$$\phi_1 : \text{propagating, ingoing} . \tag{12.41}$$
The isolation of each propagating mode into a single complex NP mode, incorporating both helicities, is simpler than the standard picture of oscillating orthogonal electric and magnetic fields.

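12.4 Weyl tensor The Weyl tensor is the trace-free part of the Riemann tensor,
$$C_{klmn} \equiv R_{klmn} - \tfrac{1}{2} ( \gamma_{km} R_{ln} - \gamma_{kn} R_{lm} + \gamma_{ln} R_{km} - \gamma_{lm} R_{kn} ) + \tfrac{1}{6} ( \gamma_{km} \gamma_{ln} - \gamma_{kn} \gamma_{lm} ) R . \tag{12.42}$$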

By construction, the Weyl tensor vanishes when contracted on any pair of indices. Whereas the Ricci and Einstein tensors vanish identically in any region of spacetime containing no energy-momentum, Tmn = 0, the Weyl tensor can be non-vanishing. Physically, the Weyl tensor describes tidal forces and gravitational waves.

12.4.1 Complexified Weyl tensor The Weyl tensor is, like the Riemann tensor, a symmetric matrix of bivectors. Just as the electromagnetic bivector $F_{kl}$ has a natural complex structure, so also the Weyl tensor $C_{klmn}$ has a natural complex structure. The properties of the Weyl tensor emerge most plainly when that complex structure is made manifest. In an orthonormal tetrad $\{ \boldsymbol{\gamma}_t , \boldsymbol{\gamma}_x , \boldsymbol{\gamma}_y , \boldsymbol{\gamma}_z \}$, the Weyl tensor $C_{klmn}$ can be written as a 6 × 6 symmetric bivector matrix, organized as a 2 × 2 matrix of 3 × 3 blocks, with the structure
$$C = \begin{pmatrix} C_{txtx} & C_{txty} & C_{txtz} & C_{txzy} & C_{txxz} & C_{txyx} \\ C_{tytx} & \cdots & \cdots & \cdots & \cdots & \cdots \\ C_{tztx} & \cdots & \cdots & \cdots & \cdots & \cdots \\ C_{zytx} & \cdots & \cdots & \cdots & \cdots & \cdots \\ C_{xztx} & \cdots & \cdots & \cdots & \cdots & \cdots \\ C_{yxtx} & \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix} = \begin{pmatrix} C_{EE} & C_{EB} \\ C_{BE} & C_{BB} \end{pmatrix} , \tag{12.43}$$
where $E$ denotes electric indices, $B$ magnetic indices, per the designation (12.21). The condition of being symmetric implies that the 3 × 3 blocks $C_{EE}$ and $C_{BB}$ are symmetric, while $C_{BE} = C_{EB}^\top$. The cyclic symmetry (11.47) of the Riemann, hence Weyl, tensor implies that the off-diagonal 3 × 3 block $C_{EB}$ (and likewise $C_{BE}$) is traceless. The natural complex structure motivates defining a complexified Weyl tensor $\tilde{C}_{klmn}$ by
$$\tilde{C}_{klmn} \equiv \frac{1}{4} \left( \delta_k^p \delta_l^q + \frac{i}{2} \varepsilon_{kl}{}^{pq} \right) \left( \delta_m^r \delta_n^s + \frac{i}{2} \varepsilon_{mn}{}^{rs} \right) C_{pqrs} \quad \text{is a tetrad tensor} \tag{12.44}$$
analogously to the definition (12.22) of the complexified electromagnetic field. The definition (12.44) of the complexified Weyl tensor $\tilde{C}_{klmn}$ is valid in any frame, not just an orthonormal frame. In an orthonormal frame, if the Weyl tensor $C_{klmn}$ is organized according to the structure (12.43), then the complexified Weyl tensor $\tilde{C}_{klmn}$ defined by equation (12.44) has the structure
$$\tilde{C} = \frac{1}{4} \begin{pmatrix} 1 & -i \\ -i & -1 \end{pmatrix} ( C_{EE} - C_{BB} + i \, C_{EB} + i \, C_{BE} ) . \tag{12.45}$$
Thus the independent components of the complexified Weyl tensor $\tilde{C}_{klmn}$ constitute a 3 × 3 complex symmetric traceless matrix $C_{EE} - C_{BB} + i ( C_{EB} + C_{BE} )$, with 5 complex degrees of freedom. Although the complexified Weyl tensor $\tilde{C}_{klmn}$ is defined, equation (12.44), as a projection of the Weyl tensor, it nevertheless retains all the 10 degrees of freedom of the original Weyl tensor $C_{klmn}$. The same complexification projection operator applied to the trace (Ricci) parts of the Riemann tensor yields only the Ricci scalar multiplied by that unique combination of the tetrad metric that has the symmetries of the Riemann tensor. Thus complexifying the trace parts of the Riemann tensor produces nothing useful.

12.4.2 Newman-Penrose components of the Weyl tensor With respect to a NP null tetrad $\{ \boldsymbol{\gamma}_v , \boldsymbol{\gamma}_u , \boldsymbol{\gamma}_+ , \boldsymbol{\gamma}_- \}$, equation (12.13), the Weyl tensor $C_{klmn}$ has 5 distinct complex components, here denoted $\psi_s$, of spins respectively $s = -2$, −1, 0, +1, and +2:
$$\begin{array}{rl} -2 : & \psi_{-2} \equiv C_{u-u-} , \\ -1 : & \psi_{-1} \equiv C_{uvu-} = C_{+-u-} , \\ 0 : & \psi_0 \equiv \tfrac{1}{2} ( C_{uvuv} + C_{uv+-} ) = \tfrac{1}{2} ( C_{+-+-} + C_{uv+-} ) = C_{v+-u} , \\ +1 : & \psi_1 \equiv C_{vuv+} = C_{-+v+} , \\ +2 : & \psi_2 \equiv C_{v+v+} . \end{array} \tag{12.46}$$
The complex conjugates $\psi_s^*$ of the 5 NP components of the Weyl tensor are:
$$\begin{array}{l} \psi_{-2}^* = C_{u+u+} , \\ \psi_{-1}^* = C_{uvu+} = C_{-+u+} , \\ \psi_0^* = \tfrac{1}{2} ( C_{uvuv} + C_{uv-+} ) = \tfrac{1}{2} ( C_{-+-+} + C_{uv-+} ) = C_{v-+u} , \\ \psi_1^* = C_{vuv-} = C_{+-v-} , \\ \psi_2^* = C_{v-v-} , \end{array} \tag{12.47}$$
whose spins have the opposite sign, in accordance with the rule (12.8) that complex conjugation flips spin. The above expressions (12.46) and (12.47) account for all the NP components $C_{klmn}$ of the Weyl tensor but four, which vanish identically:
$$C_{v+v-} = C_{u+u-} = C_{v+u+} = C_{v-u-} = 0 . \tag{12.48}$$

The above convention that the index $s$ on the NP component $\psi_s$ labels its spin differs from the standard convention, where the spin $s$ component of the Weyl tensor is impenetrably denoted $\psi_{2-s}$ (e.g. S. Chandrasekhar 1983):
$$\begin{array}{rl} -2 : & \psi_4 , \\ -1 : & \psi_3 , \\ 0 : & \psi_2 , \\ +1 : & \psi_1 , \\ +2 : & \psi_0 . \end{array} \quad \text{(standard convention, not followed here)} \tag{12.49}$$

With respect to a triple of bivector indices ordered as $\{ u- , uv , +v \}$, the NP components of the Weyl tensor constitute the 3 × 3 complex symmetric matrix
$$C_{klmn} = \begin{pmatrix} \psi_{-2} & \psi_{-1} & \psi_0 \\ \psi_{-1} & \psi_0 & \psi_1 \\ \psi_0 & \psi_1 & \psi_2 \end{pmatrix} . \tag{12.50}$$

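12.4.3 Newman-Penrose components of the complexified Weyl tensor The non-vanishing NP components of the complexified Weyl tensor $\tilde{C}_{klmn}$ defined by equation (12.44) are
$$\begin{array}{l} \tilde{C}_{u-u-} = \psi_{-2} , \\ \tilde{C}_{uvu-} = \tilde{C}_{+-u-} = \psi_{-1} , \\ \tilde{C}_{uvuv} = \tilde{C}_{+-+-} = \tilde{C}_{uv+-} = \tilde{C}_{v+-u} = \psi_0 , \\ \tilde{C}_{vuv+} = \tilde{C}_{-+v+} = \psi_1 , \\ \tilde{C}_{v+v+} = \psi_2 , \end{array} \tag{12.51}$$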

whereas any component with either of its two bivector indices equal to v− or u+ vanishes. As with the complexified electromagnetic field, the rule that complex conjugation flips spin fails here because the complexification operator breaks the rule. Equations (12.51) show that the complexified Weyl tensor in an NP tetrad contains just 5 distinct non-vanishing complex components, and those components are precisely equal to the complex spin components ψs .

12.4.4 Components of the complexified Weyl tensor in an orthonormal tetrad The complexified Weyl tensor forms a 3 × 3 complex symmetric traceless matrix in any frame, not just an NP frame. In an orthonormal frame, with respect to a triple of bivector indices $\{ tx , ty , tz \}$, the complexified Weyl tensor $\tilde{C}_{klmn}$ can be expressed in terms of the NP spin components $\psi_s$ as
$$\tilde{C}_{klmn} = \begin{pmatrix} \psi_0 & \tfrac{1}{2} ( \psi_1 - \psi_{-1} ) & -\tfrac{i}{2} ( \psi_1 + \psi_{-1} ) \\ \tfrac{1}{2} ( \psi_1 - \psi_{-1} ) & -\tfrac{1}{2} \psi_0 + \tfrac{1}{4} ( \psi_2 + \psi_{-2} ) & -\tfrac{i}{4} ( \psi_2 - \psi_{-2} ) \\ -\tfrac{i}{2} ( \psi_1 + \psi_{-1} ) & -\tfrac{i}{4} ( \psi_2 - \psi_{-2} ) & -\tfrac{1}{2} \psi_0 - \tfrac{1}{4} ( \psi_2 + \psi_{-2} ) \end{pmatrix} . \tag{12.52}$$

12.4.5 Propagating components of gravitational waves For outgoing gravitational waves, only the spin −2 component ψ−2 (the one conventionally called ψ4 ) propagates, carrying gravitational waves from a source to infinity: ψ−2 : propagating, outgoing .

(12.53)

This propagating, outgoing −2 component has spin −2, but its complex conjugate has spin +2, so effectively both spin components, or helicities, or circular polarizations, of an outgoing gravitational wave are embodied in the single complex component. The remaining 4 complex NP components (spins −1 to 2) of an outgoing gravitational wave are short range, describing the gravitational field near the source. Similarly, only the spin +2 component ψ2 of an ingoing gravitational wave propagates, carrying energy from infinity: ψ2 : propagating, ingoing .

(12.54)

12.5 Petrov classification of the Weyl tensor As seen above, the complexified Weyl tensor is a complex symmetric traceless 3 × 3 matrix. If the matrix were real symmetric (or complex Hermitian), then standard mathematical theorems would guarantee that it would be diagonalizable, with a complete set of eigenvalues and eigenvectors. But the Weyl matrix is complex symmetric, and there is no such theorem. The mathematical theorems state that a matrix is diagonalizable if and only if it has a complete set of linearly independent eigenvectors. Since there is always at least one distinct linearly independent eigenvector associated with each distinct eigenvalue, if all eigenvalues are distinct, then necessarily there is a complete set of eigenvectors, and the Weyl tensor is diagonalizable. However, if some of the eigenvalues coincide, then there may not be a complete set of linearly independent eigenvectors, in which case the Weyl tensor is not diagonalizable. The Petrov classification, tabulated in Table 12.1, classifies the Weyl tensor in accordance with the number of distinct eigenvalues and eigenvectors. The normal form is with respect to an orthonormal frame aligned with the eigenvectors to the extent possible. The tetrad with respect to which the complexified Weyl tensor takes its normal form is called the Weyl principal tetrad. The Weyl principal tetrad is unique except in cases D, O, and N. For Types D and N, the Weyl principal tetrad is unique up to Lorentz transformations that leave the eigen-bivector γtz unchanged, which is to say, transformations generated by the Lorentz rotor exp(ζγ γtz ) where ζ is complex. The Kerr-Newman geometry is Type D. General spherically symmetric geometries are Type D. The Friedmann-Robertson-Walker geometry is Type O. Plane gravitational waves are Type N.

Table 12.1 Petrov classification of the Weyl tensor

Petrov type | Distinct eigenvalues | Distinct eigenvectors | Normal form of the complexified Weyl tensor
I | 3 | 3 | $\begin{pmatrix} \psi_0 & 0 & 0 \\ 0 & -\tfrac{1}{2} \psi_0 + \tfrac{1}{2} \psi_2 & 0 \\ 0 & 0 & -\tfrac{1}{2} \psi_0 - \tfrac{1}{2} \psi_2 \end{pmatrix}$
D | 2 | 3 | $\begin{pmatrix} \psi_0 & 0 & 0 \\ 0 & -\tfrac{1}{2} \psi_0 & 0 \\ 0 & 0 & -\tfrac{1}{2} \psi_0 \end{pmatrix}$
II | 2 | 2 | $\begin{pmatrix} \psi_0 & 0 & 0 \\ 0 & -\tfrac{1}{2} \psi_0 + \tfrac{1}{4} \psi_2 & -\tfrac{i}{4} \psi_2 \\ 0 & -\tfrac{i}{4} \psi_2 & -\tfrac{1}{2} \psi_0 - \tfrac{1}{4} \psi_2 \end{pmatrix}$
O | 1 | 3 | $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$
N | 1 | 2 | $\begin{pmatrix} 0 & 0 & 0 \\ 0 & \tfrac{1}{4} \psi_2 & -\tfrac{i}{4} \psi_2 \\ 0 & -\tfrac{i}{4} \psi_2 & -\tfrac{1}{4} \psi_2 \end{pmatrix}$
III | 1 | 1 | $\begin{pmatrix} 0 & \tfrac{1}{2} \psi_1 & -\tfrac{i}{2} \psi_1 \\ \tfrac{1}{2} \psi_1 & 0 & 0 \\ -\tfrac{i}{2} \psi_1 & 0 & 0 \end{pmatrix}$

[Figure 12.1: three panels labelled expansion θ, rotation ω, and shear σ.]

Figure 12.1 Illustrating how the Sachs optical scalars, the expansion θ, the rotation ω, and the shear σ, defined by equations (12.60), characterize the rate at which a bundle of light rays changes shape as it propagates. The bundle of light is coming vertically upward out of the paper.

12.6 Raychaudhuri equations and the Sachs optical scalars Consider a light ray. Let the $\boldsymbol{\gamma}_v$ null axis lie along the worldline of the light ray. Choose the Newman-Penrose tetrad so that the tetrad axes $\boldsymbol{\gamma}_m$ are parallel-transported along the path of the light ray. You can think of a bunch of observers arrayed along the ray each observing the same image, unprecessed, unboosted, unredshifted. Mathematically, this means that the tetrad axes along the worldline of the light ray satisfy
$$\partial_v \boldsymbol{\gamma}_m = 0 . \tag{12.55}$$

By definition of the tetrad connections, this is equivalent to the conditions that the Newman-Penrose tetrad connections with final index $v$ all vanish
$$\Gamma_{kmv} = 0 . \tag{12.56}$$
The conditions (12.56) constitute a set of 6 conditions which define the Lorentz transformation of the tetrad axes along the worldline of the light ray. Given the conditions (12.56), the usual expression (11.45) for the Riemann tensor implies that the rate of change $\partial_v \Gamma_{mnl}$ of each of the 18 remaining tetrad connections $\Gamma_{mnl}$, those with final index $l \neq v$, along the worldline of the light ray satisfies
$$\partial_v \Gamma_{mnl} + \Gamma^k{}_{vl} \Gamma_{mnk} = R_{vlmn} . \tag{12.57}$$
As commented after the definition (12.13) of the NP tetrad, the null axis $\boldsymbol{\gamma}_v \equiv ( \boldsymbol{\gamma}_t + \boldsymbol{\gamma}_z ) / \sqrt{2}$ defines not a single direction, but rather a 2D surface spanned by the two directions $\boldsymbol{\gamma}_t$ and $\boldsymbol{\gamma}_z$. Orthogonal to this surface is the 2D surface spanned by the transverse axes $\boldsymbol{\gamma}_x$ and $\boldsymbol{\gamma}_y$, or equivalently by the spinor axes $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$. Of the 18 tetrad connections $\Gamma_{mnl}$ with $l \neq v$, four are embodied in the extrinsic curvature $K_{ab}$ defined by
$$K_{ab} \equiv \boldsymbol{\gamma}_a \cdot \partial_b \boldsymbol{\gamma}_v = \Gamma_{avb} \quad \text{for } a , b = + , - . \tag{12.58}$$
The extrinsic curvature (12.58) describes how the null axis $\boldsymbol{\gamma}_v$ varies over the 2D surface spanned by the transverse axes $\boldsymbol{\gamma}_+$ and $\boldsymbol{\gamma}_-$. For the extrinsic curvature $K_{ab}$, the evolution equations (12.57) become
$$\partial_v K_{ab} + K^+{}_b K_{a+} + K^-{}_b K_{a-} = R_{vbav} . \tag{12.59}$$

The Sachs optical scalars constitute the components of the extrinsic curvature $K_{ab}$ defined by equation (12.58). Conventionally, the Sachs optical scalars consist of the expansion $\theta$, the rotation $\omega$, and the complex shear $\sigma$, defined in terms of the extrinsic curvature (12.58) by

$$ \theta + i\omega \equiv K_{+-} , \tag{12.60a} $$
$$ \sigma \equiv K_{++} , \tag{12.60b} $$

whose complex conjugates are $\theta - i\omega \equiv K_{-+}$ and $\sigma^* = K_{--}$. Resolved into real and imaginary parts, the definitions (12.60) of the Sachs optical scalars are

$$ \theta \equiv \tfrac12 (K_{+-} + K_{-+}) , \tag{12.61a} $$
$$ \omega \equiv \tfrac{1}{2i} (K_{+-} - K_{-+}) , \tag{12.61b} $$
$$ \mathrm{Re}\,\sigma \equiv \tfrac12 (K_{++} + K_{--}) , \tag{12.61c} $$
$$ \mathrm{Im}\,\sigma \equiv \tfrac{1}{2i} (K_{++} - K_{--}) . \tag{12.61d} $$
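The decomposition (12.60)-(12.61) can be illustrated with a few lines of Python. This is only a sketch: the function name `sachs_scalars` and its argument names are hypothetical, and the two inputs are taken to be the complex numbers $K_{+-}$ and $K_{++}$, with $K_{-+}$ and $K_{--}$ obtained as their complex conjugates, as stated after (12.60).

```python
def sachs_scalars(k_plus_minus, k_plus_plus):
    """Split extrinsic curvature components into Sachs optical scalars.

    k_plus_minus: complex value of K_{+-} (hypothetical input name)
    k_plus_plus:  complex value of K_{++} (hypothetical input name)
    Returns (theta, omega, sigma) with theta, omega real and sigma complex.
    """
    # The remaining two components are complex conjugates of the inputs.
    k_minus_plus = k_plus_minus.conjugate()
    k_minus_minus = k_plus_plus.conjugate()

    theta = ((k_plus_minus + k_minus_plus) / 2).real      # (12.61a)
    omega = ((k_plus_minus - k_minus_plus) / 2j).real     # (12.61b)
    re_sigma = ((k_plus_plus + k_minus_minus) / 2).real   # (12.61c)
    im_sigma = ((k_plus_plus - k_minus_minus) / 2j).real  # (12.61d)
    return theta, omega, complex(re_sigma, im_sigma)


# Example values chosen arbitrarily for illustration.
theta, omega, sigma = sachs_scalars(0.3 + 0.1j, 0.2 - 0.5j)
```

In other words, the expansion and rotation are just the real and imaginary parts of $K_{+-}$, and the shear is $K_{++}$ itself.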

Physically, the Sachs scalars characterize how the shape of a bundle of light rays evolves as it propagates, as illustrated in Figure 12.1. The expansion represents how fast the bundle expands, the rotation how fast it rotates, and the shear how fast its ellipticity is changing. The amplitude and phase of the complex shear represent the amplitude and phase of the major axis of the shear ellipse.

The derivation from equations (12.59) of the evolutionary equations governing the Sachs scalars $\theta$, $\omega$, and $\sigma$ is left as Exercise 12.2. The equation (12.62b) for the shear $\sigma$ shows that the shear changes only in the presence of a non-vanishing Weyl tensor, that is, in the presence of tidal forces. The equation (12.63b) for the rotation $\omega$ shows that if the rotation is initially zero, then it will remain zero along the path of the light bundle. Thus geodesic motion cannot by itself generate rotation from nothing; but non-geodesic processes, such as the electromagnetic scattering of light, can generate rotation. The equation (12.63a) for the expansion $\theta$ is commonly called the Raychaudhuri equation (A. K. Raychaudhuri, 1955, “Relativistic cosmology, I,” Phys. Rev. 98, 1123). The Raychaudhuri equation is the basis of the focussing theorem, §12.7, which is a central ingredient of the Penrose-Hawking singularity theorems.

Exercise 12.2 Raychaudhuri equations. Show that equations (12.59) imply the following evolutionary equations for the Sachs scalars defined by (12.60):

$$ (\partial_v + \theta + i\omega)(\theta + i\omega) + \sigma\sigma^* + \tfrac12 G_{vv} = 0 , \tag{12.62a} $$
$$ (\partial_v + 2\theta)\sigma + C_{v+v+} = 0 , \tag{12.62b} $$

where $G_{mn}$ and $C_{klmn}$ denote the Einstein and Weyl tensors as usual. Show that the first (12.62a) of these equations is equivalent to the two equations

$$ \partial_v \theta + (\theta^2 - \omega^2) + \sigma\sigma^* + \tfrac12 G_{vv} = 0 , \tag{12.63a} $$
$$ (\partial_v + 2\theta)\omega = 0 . \tag{12.63b} $$

⋄

12.7 Focussing theorem

The Raychaudhuri equation (12.63a) provides the basis for the focussing theorem, which is a key ingredient of the singularity theorems introduced by Penrose (R. Penrose, 1965, “Gravitational collapse and spacetime singularities,” Phys. Rev. Lett. 14, 57–59) and elaborated extensively by Hawking (S. W. Hawking and G. F. R. Ellis, 1975, The large scale structure of space-time, Cambridge University Press). If the rotation $\omega$ vanishes, then equation (12.63a) for the expansion $\theta$ simplifies to

$$ \partial_v \theta + \theta^2 + \sigma\sigma^* + \tfrac12 G_{vv} = 0 . \tag{12.64} $$

The terms $\theta^2$ and $\sigma\sigma^*$ are necessarily non-negative. The NP component $G_{vv}$ of the Einstein tensor is related to the components in the parent orthonormal tetrad by

$$ G_{vv} = \tfrac12 G_{tt} + G_{tz} + \tfrac12 G_{zz} . \tag{12.65} $$

Boosted along the z-direction into the center-of-mass frame, where $G_{tz} = 0$, equation (12.65) reduces to

$$ G_{vv} = 4\pi(\rho + p_z) \tag{12.66} $$

where $\rho$ is the energy density and $p_z$ the pressure along the z-direction. If it is true that

$$ \rho + p_z \geq 0 , \tag{12.67} $$

then the rotation-free Raychaudhuri equation (12.64) shows that the expansion $\theta$ can never increase along the ray, $\partial_v \theta \leq 0$.
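The monotonic behaviour implied by (12.64) and (12.67) can be seen in a small numerical experiment. The sketch below integrates $\partial_v\theta = -\theta^2 - \sigma\sigma^* - \tfrac12 G_{vv}$ with a classical fourth-order Runge-Kutta step. The function names, the choice of integrator, and the assumption that $\sigma\sigma^*$ and $G_{vv}$ are constant along the ray are all illustrative choices, not taken from the text. For vanishing shear and vanishing $G_{vv}$ the equation $\partial_v\theta = -\theta^2$ has the elementary solution $\theta = \theta_0/(1 + \theta_0 v)$, which serves as a check.

```python
def expansion_rate(theta, shear_sq=0.0, g_vv=0.0):
    """Right-hand side of (12.64) solved for d(theta)/dv."""
    return -theta * theta - shear_sq - 0.5 * g_vv


def integrate_expansion(theta0, v_end, steps=1000, shear_sq=0.0, g_vv=0.0):
    """Integrate theta(v) from v = 0 to v_end with classical RK4 steps.

    shear_sq stands for sigma sigma* and g_vv for G_vv; both are held
    constant along the ray, which is an assumption of this sketch.
    Returns the list of theta values after each step.
    """
    dv = v_end / steps
    theta = theta0
    history = [theta]
    for _ in range(steps):
        k1 = expansion_rate(theta, shear_sq, g_vv)
        k2 = expansion_rate(theta + 0.5 * dv * k1, shear_sq, g_vv)
        k3 = expansion_rate(theta + 0.5 * dv * k2, shear_sq, g_vv)
        k4 = expansion_rate(theta + dv * k3, shear_sq, g_vv)
        theta += dv * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        history.append(theta)
    return history


# No shear, no matter: compare with theta0 / (1 + theta0 * v).
vacuum = integrate_expansion(theta0=1.0, v_end=1.0)
# Non-negative shear and G_vv only make the expansion fall faster.
with_matter = integrate_expansion(theta0=1.0, v_end=1.0, shear_sq=0.1, g_vv=0.4)
```

Every term on the right-hand side is non-positive when (12.67) holds, so the computed sequence of $\theta$ values never increases, whatever the initial value.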

13 ∗

The 3+1 (ADM) formalism

Einstein’s equations constitute a set of 10 coupled second-order partial differential equations. Solving these equations in a general fashion presents a formidable challenge. The 3+1, or ADM, formalism devised by R. Arnowitt, S. Deser, & C. W. Misner (1959, Phys. Rev. 116, 1322–1330; 1962, “The dynamics of general relativity,” in Gravitation: an introduction to current research, ed. L. Witten, 227–265) offers an insightful and systematic way to proceed. The formalism is widely used in numerical general relativity. For reviews, see L. Lehner (2001, “Numerical relativity: a review,” CQG 18, R25–86, gr-qc/0106072), and H. Shinkai (2008, “Formulations of the Einstein equations for numerical simulations,” APCTP winter school on black hole astrophysics, arXiv:0805.0068).

The central idea of the ADM formalism is to recast the Einstein equations into Hamiltonian form. The Hamiltonian approach identifies “canonical momenta” conjugate to the “coordinates,” and converts the equations of motion from second order partial differential equations in the coordinates into coupled first order partial differential equations in the coordinates and momenta. The Hamiltonian $H$ of a system is its “energy” expressed in terms of the coordinates and momenta. In quantum mechanics, equating the time translation operator to the Hamiltonian operator, $i\hbar\,\partial/\partial t = H$, determines the evolution in time $t$ of the system.

To implement the Hamiltonian approach in general relativity, it is necessary to identify one coordinate, the time coordinate $t$, as having a special status. The system of Einstein (and other) equations is evolved by integrating from one spacelike hypersurface of constant time, $t$ = constant, to the next.

The 3+1 formalism provides answers to several basic questions about the dynamical structure of Einstein’s equations:

1. Are there natural coordinates for the gravitational field, and what are they? Answer: Yes. The natural coordinates are the 6 components of the spatial metric $g_{\alpha\beta}$ on the hypersurfaces of constant time. The 3+1 formalism shows that only these 6 of the 10 components of the 4D metric $g_{\mu\nu}$ are governed by time evolution equations. The remaining 4 degrees of freedom in the metric represent gauge freedoms associated with general coordinate transformations.

2. What happens to the 4 degrees of freedom associated with general coordinate transformations? Answer: They are accommodated into the lapse $\alpha$ and shift $\beta^\alpha$, which express the rate at which the unit timelike normal $\gamma_0$ to the spatial hypersurfaces of constant time marches through the coordinates, equations (13.9). In the 3+1 formalism, the lapse and shift are specifiable arbitrarily, and are not governed by time evolution equations.

3. What are the 6 momenta conjugate to the 6 coordinates $g_{\alpha\beta}$? Answer: The momenta are, up to a factor, the components of the trace-modified extrinsic curvature $K_{\alpha\beta} - g_{\alpha\beta} K$, equation (13.36). The extrinsic curvature $K_{\alpha\beta}$, whose tetrad-frame expression is defined by equation (13.12b), is a symmetric 3 × 3 matrix that describes how the unit timelike normal $\gamma_0$ varies over the spatial hypersurfaces of constant time.

4. What is the structure of the 10 Einstein equations in the 3+1 formalism? Answer: The 6 spatial components of the Einstein equations provide dynamical time evolution equations for the 6 momenta. The 4 remaining components of the Einstein equations, the time-time and time-space components, prove to be constraint equations, called the Hamiltonian (or scalar) and momentum (or vector) constraints. The constraint equations specify conditions that must be arranged to be satisfied on the initial hypersurface of constant time, but thereafter the constraints are automatically satisfied (modulo numerical error and instabilities).

5. How is covariant conservation of energy-momentum expressed? Answer: The fact that the Hamiltonian and momentum constraints continue to be satisfied as time advances expresses covariant conservation of energy-momentum, as guaranteed by the contracted Bianchi identities.

In this chapter, the coordinate time index is $t$, while the tetrad time index is 0. Early-alphabet greek (brown) letters $\alpha$, $\beta$, ..., denote 3D spatial coordinate indices, while mid-alphabet greek letters $\kappa$, $\lambda$, ..., denote 4D spacetime coordinate indices. Early-alphabet latin (black) letters $a$, $b$, ..., denote 3D spatial tetrad indices, while mid-alphabet latin (black) letters $k$, $l$, ..., denote 4D spacetime tetrad indices.

13.1 ADM tetrad

The ADM formalism splits the spacetime coordinates $x^\mu$ into a time coordinate $t$ and spatial coordinates $x^\alpha$, $\alpha = 1, 2, 3$,

$$ x^\mu \equiv \{t, x^\alpha\} . \tag{13.1} $$

At each point of spacetime, the spacelike hypersurface of constant time $t$ has a unique future-pointing unit normal $\gamma_0$, defined to have unit length and to be orthogonal to the spatial tangent axes $g_\alpha$,

$$ \gamma_0 \cdot \gamma_0 = -1 , \qquad \gamma_0 \cdot g_\alpha = 0 \qquad \alpha = 1, 2, 3 . \tag{13.2} $$

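The central element of the ADM approach is to work in a tetrad frame $\gamma_m$ consisting of this time axis $\gamma_0$, together with three spatial tetrad axes $\gamma_a$ that are orthogonal to the tetrad time axis $\gamma_0$, and therefore lie in the 3D spatial hypersurface of constant time,

$$ \gamma_0 \cdot \gamma_a = 0 \qquad a = 1, 2, 3 . \tag{13.3} $$

The tetrad metric $\gamma_{mn}$ in the ADM formalism is thus

$$ \gamma_{mn} = \begin{pmatrix} -1 & 0 \\ 0 & \gamma_{ab} \end{pmatrix} , \tag{13.4} $$

and the inverse tetrad metric $\gamma^{mn}$ is correspondingly

$$ \gamma^{mn} = \begin{pmatrix} -1 & 0 \\ 0 & \gamma^{ab} \end{pmatrix} , \tag{13.5} $$

whose spatial part $\gamma^{ab}$ is the inverse of $\gamma_{ab}$. Given the conditions (13.2) and (13.3), the vierbein $e_m{}^\mu$ and inverse vierbein $e^m{}_\mu$ take the form

$$ e_m{}^\mu = \begin{pmatrix} 1/\alpha & \beta^\alpha/\alpha \\ 0 & e_a{}^\alpha \end{pmatrix} , \qquad e^m{}_\mu = \begin{pmatrix} \alpha & 0 \\ - e^a{}_\alpha \beta^\alpha & e^a{}_\alpha \end{pmatrix} , \tag{13.6} $$

where $\alpha$ and $\beta^\alpha$ are the lapse and shift (see next paragraph), and $e_a{}^\alpha$ and $e^a{}_\alpha$ represent the spatial vierbein and inverse vierbein, which are inverse to each other, $e_a{}^\alpha e^b{}_\alpha = \delta_a^b$. The ADM metric is

$$ ds^2 = - \alpha^2 dt^2 + g_{\alpha\beta} \left( dx^\alpha - \beta^\alpha dt \right) \left( dx^\beta - \beta^\beta dt \right) , \tag{13.7} $$

where $g_{\alpha\beta}$ is the spatial coordinate metric

$$ g_{\alpha\beta} = \gamma_{ab}\, e^a{}_\alpha e^b{}_\beta . \tag{13.8} $$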

Essentially all the tetrad formalism developed in Chapter 11 carries through, subject only to the conditions (13.2) and (13.3). The vierbein coefficient $\alpha$ is called the lapse, while $\beta^\alpha$ is called the shift. Physically, the lapse $\alpha$ is the rate at which the proper time $\tau$ of the tetrad rest frame elapses per unit coordinate time $t$, while the shift $\beta^\alpha$ is the velocity at which the tetrad rest frame moves through the spatial coordinates $x^\alpha$ per unit coordinate time $t$,

$$ \alpha = \frac{d\tau}{dt} , \qquad \beta^\alpha = \frac{dx^\alpha}{dt} . \tag{13.9} $$

These relations (13.9) follow from the fact that the 4-velocity in the tetrad rest frame is by definition $u^m \equiv \{1, 0, 0, 0\}$, so the coordinate 4-velocity $u^\mu \equiv e_m{}^\mu u^m$ of the tetrad rest frame is

$$ \frac{dx^\mu}{d\tau} \equiv u^\mu = e_0{}^\mu = \frac{1}{\alpha} \{1, \beta^\alpha\} . \tag{13.10} $$
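The relations between lapse, shift, vierbein, and metric in (13.6)-(13.10) can be checked numerically. The following sketch uses hypothetical function names and plain nested lists, and assumes an orthonormal spatial tetrad so that the tetrad metric is Minkowski. It builds the coordinate metric twice, once by expanding the line element (13.7) and once as $\gamma_{mn} e^m{}_\mu e^n{}_\nu$ from the inverse vierbein (13.6), and evaluates the norm of the rest-frame 4-velocity (13.10).

```python
def adm_metric(lapse, shift, spatial_metric):
    """Coordinate metric g_{mu nu} read off from the ADM line element (13.7).

    Index 0 is t; indices 1..3 are the spatial coordinates.
    """
    # Lower the shift index: beta_alpha = g_{alpha beta} beta^beta.
    shift_lower = [sum(spatial_metric[a][b] * shift[b] for b in range(3))
                   for a in range(3)]
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = -lapse ** 2 + sum(shift[a] * shift_lower[a] for a in range(3))
    for a in range(3):
        g[0][a + 1] = g[a + 1][0] = -shift_lower[a]
        for b in range(3):
            g[a + 1][b + 1] = spatial_metric[a][b]
    return g


def metric_from_vierbein(lapse, shift, e_inv):
    """Same metric built as eta_{mn} e^m_mu e^n_nu from the inverse vierbein (13.6).

    e_inv[a][alpha] holds e^a_alpha; the spatial tetrad is assumed orthonormal.
    """
    e = [[0.0] * 4 for _ in range(4)]  # e[m][mu] = e^m_mu
    e[0][0] = lapse                    # e^0_t = alpha, e^0_alpha = 0
    for a in range(3):
        e[a + 1][0] = -sum(e_inv[a][b] * shift[b] for b in range(3))
        for b in range(3):
            e[a + 1][b + 1] = e_inv[a][b]
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[sum(eta[m] * e[m][mu] * e[m][nu] for m in range(4))
             for nu in range(4)] for mu in range(4)]


def rest_frame_norm(lapse, shift, g):
    """g_{mu nu} u^mu u^nu for u^mu = (1/alpha){1, beta^alpha}, equation (13.10)."""
    u = [1.0 / lapse] + [b / lapse for b in shift]
    return sum(g[m][n] * u[m] * u[n] for m in range(4) for n in range(4))


# Arbitrary illustrative values.
lapse, shift = 2.0, [0.5, 0.0, -1.0]
e_inv = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.5, 0.0, 3.0]]
# Spatial coordinate metric (13.8) with gamma_ab = delta_ab.
g_spatial = [[sum(e_inv[a][i] * e_inv[a][j] for a in range(3))
              for j in range(3)] for i in range(3)]
g = adm_metric(lapse, shift, g_spatial)
```

The two constructions agree component by component, and the rest-frame 4-velocity has norm $-1$ for any choice of lapse and shift, as expected for a timelike unit vector.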

13.2 Traditional ADM approach

The traditional ADM approach sets the spatial tetrad axes $\gamma_a$ equal to the spatial coordinate tangent axes $g_\alpha$,

$$ \gamma_a = g_\alpha \qquad \text{(traditional ADM)} , \tag{13.11} $$

equivalent to choosing the spatial vierbein to be the unit matrix, $e_a{}^\alpha = \delta_a^\alpha$. The traditional ADM approach may be termed semi-tetrad, since it works with a tetrad time axis $\gamma_0$ together with coordinate spatial axes $g_\alpha$.

It is natural however to extend the ADM approach into a full tetrad approach, allowing the spatial tetrad axes $\gamma_a$ to be chosen more generally, subject only to the condition (13.3) that they be orthogonal to the tetrad time axis, and therefore lie in the hypersurface of constant time $t$. For example, the spatial tetrad $\gamma_a$ can be chosen to form 3D orthonormal axes, $\gamma_{ab} \equiv \gamma_a \cdot \gamma_b = \delta_{ab}$, so that the full 4D tetrad metric $\gamma_{mn}$ is Minkowski. This chapter follows the full tetrad approach to the ADM formalism, but all the results hold for the traditional case where the spatial tetrad axes are set equal to the coordinate spatial axes, equation (13.11).

13.3 Spatial tetrad vectors and tensors

Since the tetrad time axis $\gamma_0$ in the ADM formalism is defined uniquely by the choice of hypersurfaces of constant time $t$, there is no freedom of tetrad transformations of the time axis distinct from temporal coordinate transformations (no distinct freedom of Lorentz boosts). However, there is still freedom of tetrad transformations of the spatial tetrad axes (spatial rotations).

A covariant spatial tetrad vector $A_a$ is defined in the usual way as a vector that transforms like the spatial tetrad axes $\gamma_a$. Similarly a covariant spatial tetrad tensor is a tensor that transforms like products of the spatial tetrad axes $\gamma_a$. Indices on spatial tetrad vectors and tensors are raised with the inverse spatial tetrad metric $\gamma^{ab}$, and lowered with the spatial tetrad metric $\gamma_{ab}$.

A temporal coordinate transformation changes the hypersurface of constant time $t$, and therefore changes its unit normal, the tetrad time axis $\gamma_0$, and correspondingly all the spatial tetrad axes $\gamma_a$.

13.4 ADM connections, gravity, and extrinsic curvature

Since the tetrad time axis $\gamma_0$ is a spatial tetrad scalar, its directed time derivative $\partial_0 \gamma_0$ is a spatial tetrad scalar, while its directed spatial derivatives $\partial_b \gamma_0$ form a spatial tetrad vector. It follows that the connections $\Gamma_{m0n}$ defined by the directed derivatives of $\gamma_0$ are spatial tetrad tensors. These connections play an important role in the ADM formalism, and they are given special names and symbols, the gravity $\kappa_a$, and the extrinsic curvature $K_{ab}$ (the remaining components of the directed derivatives of $\gamma_0$ vanish, $\Gamma_{000} = \Gamma_{00a} = 0$):

$$ \kappa_a \equiv \gamma_a \cdot \partial_0 \gamma_0 = \Gamma_{a00} \qquad \text{is a spatial tetrad vector} , \tag{13.12a} $$
$$ K_{ab} \equiv \gamma_a \cdot \partial_b \gamma_0 = \Gamma_{a0b} \qquad \text{is a spatial tetrad tensor} . \tag{13.12b} $$

The gravity $\kappa_a$ is justly named because the geodesic equation shows that it is minus the acceleration experienced in the tetrad rest frame, where $u^m = \{1, 0, 0, 0\}$,

$$ \frac{du^a}{d\tau} = - \kappa^a . \tag{13.13} $$

The extrinsic curvature $K_{ab}$ describes how the unit normal $\gamma_0$ to the 3-dimensional spatial hypersurface changes over the hypersurface, and can therefore be regarded as embodying the curvature of the 3-dimensional spatial hypersurface embedded in the 4-dimensional spacetime. From equation (11.40) with vanishing torsion, it follows that the gravity and the extrinsic curvature are

$$ \kappa_a = d_{00a} , \tag{13.14a} $$
$$ K_{ab} = \tfrac12 \left( \partial_0 \gamma_{ab} - d_{ab0} - d_{ba0} + d_{a0b} + d_{b0a} \right) , \tag{13.14b} $$

where the relevant vierbein derivatives $d_{lmn}$ are

$$ d_{00a} = \frac{1}{\alpha}\, \partial_a \alpha , \qquad d_{a0b} = \frac{1}{\alpha}\, e_{a\alpha}\, \partial_b \beta^\alpha , \qquad d_{ab0} = - \gamma_{ac}\, e_b{}^\beta\, \partial_0 e^c{}_\beta . \tag{13.15} $$

Equation (13.14b) shows that the extrinsic curvature is symmetric,

$$ K_{ab} = K_{ba} . \tag{13.16} $$

The non-vanishing tetrad connections are, from the general formula (11.40) with vanishing torsion,

$$ \Gamma_{a00} = - \Gamma_{0a0} = \kappa_a , \tag{13.17a} $$
$$ \Gamma_{a0b} = - \Gamma_{0ab} = K_{ab} , \tag{13.17b} $$
$$ \Gamma_{ab0} = K_{ab} + d_{ab0} - d_{a0b} , \tag{13.17c} $$
$$ \Gamma_{abc} = \text{same as eq. (11.40)} . \tag{13.17d} $$

The connections (13.17a) and (13.17b) form, as commented above, a spatial tetrad vector $\kappa_a$ and tensor $K_{ab}$, but the remaining connections (13.17c) and (13.17d) are not spatial tetrad tensors. Note that the purely spatial tetrad connections $\Gamma_{abc}$, like the spatial tetrad axes $\gamma_a$, transform under temporal coordinate transformations despite the absence of temporal indices.

13.5 ADM Riemann, Ricci, and Einstein tensors

The ADM Riemann tensor $R_{klmn}$ inherits from the standard tetrad formalism the property of being a full tetrad tensor, its components transforming like products of the tetrad axes $\gamma_m$. Of course, since the ADM tetrad time axis $\gamma_0$ is tied to the coordinates, a tetrad transformation of the time axis requires a simultaneous coordinate transformation consistent with it. The usual expression (11.45) for the tetrad-frame Riemann tensor with vanishing torsion yields the ADM Riemann tensor

$$ R_{0a0b} = - D_0 K_{ab} - K_{ac} K_b{}^c + \frac{1}{\alpha} D_a D_b \alpha , \tag{13.18a} $$
$$ R_{0abc} = D_c K_{ab} - D_b K_{ac} , \tag{13.18b} $$
$$ R_{abcd} = K_{ac} K_{bd} - K_{ad} K_{bc} + R^{(3)}_{abcd} . \tag{13.18c} $$

Here $R^{(3)}_{abcd}$ is the spatial tetrad Riemann tensor considered confined to the 3D spatial hypersurface (given by equation (11.45) with all time components discarded), and $D_m$ denotes the usual 4D tetrad-frame covariant derivative, with the understanding that when acting on a spatial tensor such as $K_{ab}$, all time components of the spatial tensor are to be considered equal to zero. Thus the 4D tetrad covariant derivative of a spatial tetrad vector $A_a$ is

$$ D_m A_a = \partial_m A_a - \Gamma^b{}_{am} A_b , \tag{13.19} $$

in which the possible $\Gamma^0{}_{am} A_0$ term is considered to vanish. When acting on a spatial tetrad tensor, the spatial part $D_a$ of the 4D tetrad covariant derivative $D_m$ involves only spatial connections, and is identical to the 3D spatial tetrad covariant derivative $D^{(3)}_a$ considered confined to the 3D spatial hypersurface,

$$ D_a \equiv D^{(3)}_a \qquad \text{when acting on a spatial tetrad tensor} . \tag{13.20} $$

The covariant tetrad time derivative $D_0$ acting on a spatial tetrad tensor yields a 3D spatial tetrad tensor, but the full 4D tetrad covariant derivative $D_m$ acting on a spatial tetrad tensor does not yield a 4D tetrad tensor. It is important to bear these fine distinctions in mind when integrating by parts, as done in §13.6.

The Ricci tensor $R_{km} \equiv \gamma^{ln} R_{klmn}$ is, like the Riemann tensor, a tetrad tensor. Its components are

$$ R_{00} = - \partial_0 K - K_{ab} K^{ab} + \frac{1}{\alpha} D_a D^a \alpha , \tag{13.21a} $$
$$ R_{0a} = D^b K_{ab} - \partial_a K , \tag{13.21b} $$
$$ R_{ab} = D_0 K_{ab} + K K_{ab} - \frac{1}{\alpha} D_a D_b \alpha + R^{(3)}_{ab} , \tag{13.21c} $$

where $K \equiv \gamma^{ab} K_{ab}$ is the trace of the extrinsic curvature, and $R^{(3)}_{ac} \equiv \gamma^{bd} R^{(3)}_{abcd}$ is the Ricci tensor confined to the 3D spatial hypersurface. The Ricci scalar $R \equiv \gamma^{km} R_{km}$ is

$$ R = 2\, \partial_0 K + K^2 + K_{ab} K^{ab} - \frac{2}{\alpha} D_a D^a \alpha + R^{(3)} , \tag{13.22} $$

where $R^{(3)} \equiv \gamma^{ab} R^{(3)}_{ab}$ is the Ricci scalar confined to the 3D spatial hypersurface. The Einstein tensor $G_{km} \equiv R_{km} - \tfrac12 \gamma_{km} R$ is

$$ G_{00} = \tfrac12 \left( K^2 - K_{ab} K^{ab} + R^{(3)} \right) , \tag{13.23a} $$
$$ G_{0a} = D^b K_{ab} - D_a K , \tag{13.23b} $$
$$ G_{ab} = D_0 K_{ab} + K K_{ab} - \frac{1}{\alpha} D_a D_b \alpha - \gamma_{ab} \left( \partial_0 K + \tfrac12 K^2 + \tfrac12 K_{cd} K^{cd} - \frac{1}{\alpha} D_c D^c \alpha \right) + G^{(3)}_{ab} , \tag{13.23c} $$

where $G^{(3)}_{ab} \equiv R^{(3)}_{ab} - \tfrac12 \gamma_{ab} R^{(3)}$ is the Einstein tensor confined to the 3D spatial hypersurface.
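As a check on (13.23a), the following short worked step uses only (13.21a), (13.22), and the tetrad metric component $\gamma_{00} = -1$ from (13.4). It is offered as an illustration of how the second-derivative terms cancel in the time-time component, not as part of the original derivation.

```latex
\begin{align*}
G_{00} &= R_{00} - \tfrac12 \gamma_{00} R = R_{00} + \tfrac12 R \\
 &= \Bigl( - \partial_0 K - K_{ab} K^{ab} + \tfrac{1}{\alpha} D_a D^a \alpha \Bigr)
  + \Bigl( \partial_0 K + \tfrac12 K^2 + \tfrac12 K_{ab} K^{ab}
          - \tfrac{1}{\alpha} D_a D^a \alpha + \tfrac12 R^{(3)} \Bigr) \\
 &= \tfrac12 \bigl( K^2 - K_{ab} K^{ab} + R^{(3)} \bigr) .
\end{align*}
```

The terms $\partial_0 K$ and $(1/\alpha) D_a D^a \alpha$ drop out, so $G_{00}$ contains no time derivative of the extrinsic curvature. This is consistent with the time-time Einstein equation acting as a constraint rather than an evolution equation.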

13.6 ADM action

In this chapter up to this point, the Einstein tensor and other quantities have been expressed in terms of other things, but no equation of motion has been invoked. The ADM philosophy is to derive the gravitational equations of motion, Einstein’s equations, in Hamiltonian form. The starting point is to write down the gravitational action, extremization of which will yield equations of motion.

The Hilbert gravitational action $S_g$ is

$$ S_g = \int_{t_i}^{t_f} L_g \, dx^4 = \frac{1}{16\pi} \int_{t_i}^{t_f} R \, dx^4 , \tag{13.24} $$

with scalar Lagrangian $L_g$ equal to a normalization factor times the Ricci scalar $R$,

$$ L_g = \frac{1}{16\pi} R . \tag{13.25} $$

The integration in the action (13.24) is over a 4-volume from an initial hypersurface of constant time $t_i$ to a final hypersurface of constant time $t_f$. The integration measure $dx^4$ in the integral (13.24) denotes the scalar 4-volume element (footnote 1). With respect to coordinates $\{t, x^\alpha\}$, the scalar 4-volume element is

$$ dx^4 = \alpha \, dt \, dx^3_0 = (\alpha/e) \, dt \, d^3x , \tag{13.26} $$

where $dx^3_0$ is the tetrad time component (the component along the timelike normal $\gamma_0$ to the spatial hypersurface) of the vector 3-volume element $dx^3_k$, and $e = |e_a{}^\beta|$ is the determinant of the spatial vierbein, so that $1/e = |e^a{}_\beta|$ is the determinant of the inverse spatial vierbein. Instead of the scalar Lagrangian $L_g$, it is equally possible to work with the Lagrangian density $\mathcal{L}_g$,

$$ S_g = \int_{t_i}^{t_f} \mathcal{L}_g \, dt \, d^3x \qquad \text{with} \qquad \mathcal{L}_g \equiv L_g \, \alpha/e . \tag{13.27} $$

Either way, the important thing is to be careful to get factors in the integration measure right when integrating by parts. The following development works with the scalar Lagrangian $L_g$, in which case a term integrates over a scalar 4-volume $V$ to a 3D surface integral over the 3-boundary $\partial V$ of the volume provided that the term is a covariant 4-divergence,

$$ \int_V D \cdot A \, dx^4 = \oint_{\partial V} A \cdot dx^3 . \tag{13.28} $$

The least action principle demands that the conditions on the initial and final hypersurfaces $t_i$ and $t_f$ be considered fixed, and asserts that the path followed by the system between the fixed initial and final conditions is such that the action is minimized. The equations of motion that result from extremizing the action are unaffected by terms in the scalar Lagrangian that are covariant 4-divergences, since these integrate to surface terms that are asserted to be fixed, and therefore unchanged by variation.

The ADM expression (13.22) for the Ricci scalar involves two terms, $2\partial_0 K$ and $(2/\alpha) D_a D^a \alpha$, that contain second derivatives of the vierbein coefficients, and therefore demand to be integrated by parts to bring the Lagrangian into Hamiltonian form, depending only on first derivatives. To integrate the $\partial_0 K$ term by parts, it is necessary to express it in terms of a covariant 4-divergence, which is accomplished by

$$ \partial_0 K = D_m K^m - K^2 , \tag{13.29} $$

where $K^m \equiv \{K, 0, 0, 0\}$. Similarly, the $(1/\alpha) D_a D^a \alpha$ term is converted to a 4-divergence by

$$ \frac{1}{\alpha} D_a D^a \alpha = D_m \kappa^m , \tag{13.30} $$

where $\kappa^m \equiv \{0, \kappa^a\}$ with $\kappa_a$ defined by equation (13.12a). Inserting the ADM expression (13.22) for the Ricci scalar into the Hilbert action (13.24), and integrating the $\partial_0 K$ and $(1/\alpha) D_a D^a \alpha$ terms by parts, yields the ADM gravitational action

$$ S_g = \frac{1}{8\pi} \left[ \int K \, dx^3_0 \right]_{t_i}^{t_f} - \frac{1}{8\pi} \oint_{\partial V} \kappa^a \, dx^3_a + \frac{1}{16\pi} \int_{t_i}^{t_f} \left( K_{ab} K^{ab} - K^2 + R^{(3)} \right) dx^4 . \tag{13.31} $$

For the purpose of extremizing the action, the surface terms can be discarded. Thus the action to be extremized is the one with the ADM Lagrangian

$$ L_{\rm ADM} = \frac{1}{16\pi} \left( K_{ab} K^{ab} - K^2 + R^{(3)} \right) . \tag{13.32} $$

Footnote 1: Technically, the scalar volume element $dx^4$ is the quadvector, or pseudoscalar, 4-volume element, the differential 4-form $\frac{1}{4!} \varepsilon_{klmn} \, dx^k \wedge dx^l \wedge dx^m \wedge dx^n$. Likewise the vector 3-volume element $dx^3_k$ is the trivector, or pseudovector, 3-volume element, the differential 3-form $\frac{1}{3!} \varepsilon_{klmn} \, dx^l \wedge dx^m \wedge dx^n$.
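The algebra behind the passage from (13.22) to the ADM Lagrangian (13.32) can be written out in one short worked step. The sketch below simply substitutes (13.29) and (13.30) into (13.22); it assumes nothing beyond those three equations.

```latex
\begin{align*}
R &= 2\,\partial_0 K + K^2 + K_{ab} K^{ab} - \frac{2}{\alpha} D_a D^a \alpha + R^{(3)} \\
  &= 2\,\bigl( D_m K^m - K^2 \bigr) + K^2 + K_{ab} K^{ab} - 2\, D_m \kappa^m + R^{(3)} \\
  &= 2\, D_m \bigl( K^m - \kappa^m \bigr) + K_{ab} K^{ab} - K^2 + R^{(3)} .
\end{align*}
```

The first term is a covariant 4-divergence, which by (13.28) integrates to surface terms. What remains, multiplied by $1/(16\pi)$, is the combination $K_{ab} K^{ab} - K^2 + R^{(3)}$ that depends only on first derivatives.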

According to the usual procedure, conjugate momenta are obtained as partial derivatives of the Lagrangian with respect to velocities. In the present instance, the velocities are time derivatives of the coordinates. Now the only things containing time derivatives in the ADM Lagrangian (13.32) are those contained in the terms involving the extrinsic curvature $K_{ab}$ (the spatial Ricci scalar $R^{(3)}$ contains no time derivatives). From the expression (13.14b) for the extrinsic curvature, together with equations (13.15), the time derivatives in the extrinsic curvature are the directed time derivatives $\partial_0 \gamma_{ab}$ of the spatial tetrad metric, and the directed time derivatives $\partial_0 e^c{}_\beta$ of the spatial inverse vierbein. However, these time derivatives appear only in the combination

$$ \partial_0 \gamma_{ab} - d_{ab0} - d_{ba0} = e_a{}^\alpha e_b{}^\beta \, \partial_0 \left( \gamma_{cd}\, e^c{}_\alpha e^d{}_\beta \right) = e_a{}^\alpha e_b{}^\beta \, \partial_0 g_{\alpha\beta} . \tag{13.33} $$

Thus the ADM Lagrangian picks out the natural coordinates as being the spatial components $g_{\alpha\beta}$ of the coordinate metric, since only time derivatives of these appear in the ADM Lagrangian. An expression for the extrinsic curvature that demonstrates explicitly its dependence on the time derivatives of the spatial coordinate metric $g_{\alpha\beta}$ is, from manipulating equation (13.14b),

$$ K_{ab} = \frac{1}{2\alpha} \left( e_a{}^\alpha e_b{}^\beta \frac{\partial g_{\alpha\beta}}{\partial t} - D_a e_{bt} - D_b e_{at} \right) . \tag{13.34} $$

Conjugate momenta are obtained by differentiating the Lagrangian with respect to the velocities. The derivatives of the extrinsic curvature with respect to the velocities $\dot g_{\alpha\beta} \equiv \partial g_{\alpha\beta} / \partial t$ are

$$ \frac{\partial K_{ab}}{\partial \dot g_{\alpha\beta}} = \frac{1}{2\alpha}\, e_a{}^\alpha e_b{}^\beta . \tag{13.35} $$

The conjugate momenta $\pi^{\alpha\beta}$ are therefore

$$ \pi^{\alpha\beta} \equiv \alpha \frac{\partial L_{\rm ADM}}{\partial \dot g_{\alpha\beta}} = \frac{1}{16\pi} \left( K^{\alpha\beta} - g^{\alpha\beta} K \right) , \tag{13.36} $$

in which the factor of $\alpha$ is introduced to convert the conjugate momenta into a tensor, as opposed to a tensor density. Projected into the tetrad frame, the conjugate momenta are

$$ \pi^{ab} = e^a{}_\alpha e^b{}_\beta \, \pi^{\alpha\beta} = \frac{1}{16\pi} \left( K^{ab} - \gamma^{ab} K \right) . \tag{13.37} $$
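The step from the Lagrangian (13.32) and the velocity derivative (13.35) to the momenta (13.36) is a chain-rule computation. A sketch of it, treating $K_{ab}$ as the only velocity-dependent quantity in $L_{\rm ADM}$ as the text states:

```latex
\begin{align*}
\frac{\partial L_{\rm ADM}}{\partial K_{ab}}
 &= \frac{1}{16\pi}\, \frac{\partial}{\partial K_{ab}} \bigl( K_{cd} K^{cd} - K^2 \bigr)
  = \frac{1}{8\pi} \bigl( K^{ab} - \gamma^{ab} K \bigr) , \\
\pi^{\alpha\beta}
 &= \alpha\, \frac{\partial L_{\rm ADM}}{\partial K_{ab}}\,
    \frac{\partial K_{ab}}{\partial \dot g_{\alpha\beta}}
  = \alpha \cdot \frac{1}{8\pi} \bigl( K^{ab} - \gamma^{ab} K \bigr) \cdot
    \frac{1}{2\alpha}\, e_a{}^\alpha e_b{}^\beta
  = \frac{1}{16\pi} \bigl( K^{\alpha\beta} - g^{\alpha\beta} K \bigr) .
\end{align*}
```

The factor $1/(2\alpha)$ from (13.35) is cancelled by the explicit factor of $\alpha$ in the definition (13.36), which is why the result carries no lapse dependence.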

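The ADM Lagrangian (13.32) can be rewritten in terms of the conjugate momenta (13.37) as

$$ L_{\rm ADM} = 2 K_{ab} \pi^{ab} - \frac{G^0{}_0}{8\pi} , \tag{13.38} $$

in which the 3D Ricci scalar $R^{(3)}$ has been eliminated in favour of the time-time component $G^0{}_0 = - G_{00}$ of the tetrad-frame Einstein tensor by equation (13.23a). Substituting the expression (13.34) for the extrinsic curvature $K_{ab}$ brings the ADM Lagrangian to

$$ L_{\rm ADM} = \frac{1}{\alpha} \left( e_a{}^\alpha e_b{}^\beta \frac{\partial g_{\alpha\beta}}{\partial t} - D_a e_{bt} - D_b e_{at} \right) \pi^{ab} - \frac{G^0{}_0}{8\pi} . \tag{13.39} $$

The two terms $D_a e_{bt}$ and $D_b e_{at}$ can be combined into one because of the symmetry of $\pi^{ab}$, and integrated by parts to give the ADM action

$$ S_{\rm ADM} \equiv \int_{t_i}^{t_f} L_{\rm ADM} \, dx^4 = -2 \oint_{\partial V} e_{bt}\, \pi^{ab} \, dx^3_a + \int_{t_i}^{t_f} \left( e_a{}^\alpha e_b{}^\beta \frac{\partial g_{\alpha\beta}}{\partial t}\, \pi^{ab} + 2\, e_{at} D_b \pi^{ab} - \alpha \frac{G^0{}_0}{8\pi} \right) dt \, dx^3_0 . \tag{13.40} $$

Once again, for the purposes of extremizing the action, the surface term can be discarded. Eliminating the $D_b \pi^{ab}$ term in favour of the time-space part $G^0{}_a = - G_{0a}$ of the tetrad-frame Einstein tensor by equation (13.23b) produces

$$ S_{\rm ADM} = \int_{t_i}^{t_f} \left( e_a{}^\alpha e_b{}^\beta \frac{\partial g_{\alpha\beta}}{\partial t}\, \pi^{ab} - e^a{}_t \frac{G^0{}_a}{8\pi} - e^0{}_t \frac{G^0{}_0}{8\pi} \right) dt \, dx^3_0 . \tag{13.41} $$

This is the ADM gravitational action in desired Hamiltonian form. Compactly,

$$ S_{\rm ADM} = \int_{t_i}^{t_f} \left( \frac{\partial g_{\alpha\beta}}{\partial t}\, \pi^{\alpha\beta} - H \right) dt \, dx^3_0 \tag{13.42} $$

where $H$ is the Hamiltonian

$$ H \equiv \frac{e^m{}_t \, G^0{}_m}{8\pi} = \frac{G^0{}_t}{8\pi} . \tag{13.43} $$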

13.7 ADM equations of motion

Equations of motion follow from extremizing the ADM action (13.42) in Hamiltonian form. To accomplish this, the Hamiltonian $H$ must be expressed in terms of the coordinates and momenta. The expression (13.42) for the ADM action shows that the natural dynamical coordinates and momenta of the system are the components of the spatial coordinate-frame metric $g_{\alpha\beta}$ and their conjugate momenta $\pi^{\alpha\beta}$. In terms of these, the Hamiltonian is

$$ H = \frac{1}{8\pi} \left( \alpha\, G^0{}_0 + \beta^\alpha G^0{}_\alpha \right) = \alpha \left\{ 16\pi \left[ \pi_{\alpha\beta} \pi^{\alpha\beta} - \tfrac12 (\pi^\gamma{}_\gamma)^2 \right] - \frac{R^{(3)}}{16\pi} \right\} - \beta^\alpha \left\{ 2\, D^\beta \pi_{\alpha\beta} \right\} , \tag{13.44} $$

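where the expressions for $G^0{}_0$ and $G^0{}_\alpha$ are from equations (13.23a) and (13.23b) recast into coordinate-frame quantities, the extrinsic curvature $K_{\alpha\beta}$ being eliminated in favour of the momenta $\pi_{\alpha\beta}$ by equation (13.36).

The expression (13.44) for the Hamiltonian depends not only on the coordinates $g_{\alpha\beta}$ and momenta $\pi^{\alpha\beta}$, but also on the lapse $\alpha$ and shift $\beta^\alpha$, which are independent of $g_{\alpha\beta}$ and $\pi^{\alpha\beta}$. Consequently the lapse and shift must also be treated as additional coordinates. The Hamiltonian (13.44) depends on the lapse and shift linearly, the quantities in braces in equation (13.44) being independent of the lapse and shift.

The Hamiltonian (13.44) also depends on spatial derivatives $\partial g_{\alpha\beta} / \partial x^\gamma$ of the coordinates through its dependence on the Riemann 3-scalar $R^{(3)}$ and on the spatial connections $\Gamma^\alpha{}_{\beta\gamma}$ associated with the covariant spatial coordinate derivative $D^\beta$ in the term $D^\beta \pi_{\alpha\beta}$. However, the spatial derivatives $\partial g_{\alpha\beta} / \partial x^\gamma$ are to be considered as determined by the coordinates $g_{\alpha\beta}$ as a function of the spatial coordinates $x^\gamma$ on a spatial hypersurface of constant time $t$, not as independent coordinates to be varied separately.

Variation of the ADM action (13.42) with respect to the lapse $\alpha$, the shift $\beta^\alpha$, the coordinates $g_{\alpha\beta}$, and the momenta $\pi^{\alpha\beta}$, gives

$$ \delta S_{\rm ADM} = \left[ \int \pi^{\alpha\beta}\, \delta g_{\alpha\beta} \, dx^3_0 \right]_{t_i}^{t_f} + \int_{t_i}^{t_f} \left\{ - \frac{\partial H}{\partial \alpha}\, \delta\alpha - \frac{\partial H}{\partial \beta^\alpha}\, \delta\beta^\alpha - \left( \frac{\partial \pi^{\alpha\beta}}{\partial t} + \frac{\partial H}{\partial g_{\alpha\beta}} \right) \delta g_{\alpha\beta} + \left( \frac{\partial g_{\alpha\beta}}{\partial t} - \frac{\partial H}{\partial \pi^{\alpha\beta}} \right) \delta \pi^{\alpha\beta} \right\} dt \, dx^3_0 . \tag{13.45} $$

The least action principle asserts that the action is minimized along the actual path taken by the system between fixed initial and final conditions at $t_i$ and $t_f$. Setting the variation (13.45) of the action equal to zero with respect to arbitrary variations $\delta\alpha$ and $\delta\beta^\alpha$ of the lapse and shift yields the constraint equations

$$ \frac{\partial H}{\partial \alpha} = 0 , \qquad \frac{\partial H}{\partial \beta^\alpha} = 0 . \tag{13.46} $$

Setting the variation (13.45) of the action equal to zero with respect to arbitrary variations $\delta g_{\alpha\beta}$ and $\delta \pi^{\alpha\beta}$ of the coordinates and momenta yields Hamilton’s equations

$$ \frac{\partial \pi^{\alpha\beta}}{\partial t} = - \frac{\partial H}{\partial g_{\alpha\beta}} , \qquad \frac{\partial g_{\alpha\beta}}{\partial t} = \frac{\partial H}{\partial \pi^{\alpha\beta}} . \tag{13.47} $$

13.8 Constraints and energy-momentum conservation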

14 ∗

The geometric algebra

The geometric algebra is an intuitively appealing formalism that draws together several mathematical threads relevant to special and general relativity:

1. What is the best way to conceptualize Lorentz transformations (§14.6), and to implement them on a computer (§14.20)?

2. How is it that bivectors in 4D spacetime have a natural complex structure, which has been seen to be the heart of the Newman-Penrose formalism? See §14.17.

3. How can spin-$\tfrac12$ objects be incorporated into general relativity? What is a Dirac spinor? See §14.23.

4. How and why do differential forms work?

This chapter starts by setting up the geometric algebra in n-dimensional Euclidean space $\mathbb{R}^n$, then generalizes to Minkowski space, where the geometric algebra is called the spacetime algebra. The 4D spacetime algebra proves to be identical to the Clifford algebra of the Dirac γ-matrices (which explains the adoption of the symbol $\gamma_m$ to denote the basis vectors of a tetrad). Although the formalism is presented initially in Euclidean or Minkowski space, everything generalizes immediately to general relativity, where the basis vectors $\gamma_m$ form the basis of an orthonormal tetrad at each point of spacetime.

One convention adopted here, which agrees with the convention adopted by OpenGL and the computer graphics industry, but is opposite to the standard physics convention, is that a rotor $R$ rotates a multivector $a$ as $a \rightarrow \bar{R} a R$, equation (14.35). This, along with the standard definition (14.16) for the pseudoscalar, has the consequence that a right-handed rotation corresponds to $R = e^{i\theta/2}$ with $\theta$ increasing, and that rotations accumulate to the right, that is, a rotation $R$ followed by a rotation $S$ is the product $RS$. By contrast, in the standard physics convention $a \rightarrow R a \bar{R}$, a right-handed rotation corresponds to $R = e^{-i\theta/2}$, and rotations accumulate to the left, that is, $R$ followed by $S$ is $SR$. The convention adopted here also means that a Weyl or Dirac spinor $\varphi$ is isomorphic to a scaled reverse rotor $\bar{R}$, not to $R$ as in the standard physics convention.

In this chapter, boldface denotes a multivector. A rotor is written in normal (not bold) face as a reminder that, even though a rotor is an even member of the geometric algebra, it can also be regarded as a spin-$\tfrac12$ object with a transformation law (14.37) different from that (14.35) of multivectors. Later Latin indices $m, n, ...$ run over both time and space indices 0, 1, 2, 3, while earlier Latin indices $i, j, k$ run over spatial indices 1, 2, 3 only.



Figure 14.1 Multivectors of grade 1, 2, and 3: a vector a (left), a bivector a ∧ b (middle), and a trivector a ∧ b ∧ c (right).

14.1 Products of vectors In 3-dimensional Euclidean space R3 , there are two familiar ways of taking the product of two vectors, the scalar product and the vector product. 1. The scalar product a · b, also known as the dot product or inner product, of two vectors a and b is a scalar of magnitude |a| |b| cos θ, where |a| and |b| are the lengths of the two vectors, and θ the angle between them. The scalar product is commutative, a · b = b · a. 2. The vector product, a × b, also known as the cross product, is a vector of magnitude |a| |b| sin θ, directed perpendicular to both a and b, such that a, b, and a × b form a right-handed set. The vector product is anticommutative, a × b = −b × a. The definition of the scalar product continues to work fine in a Euclidean space of any dimension, but the definition of the vector product works only in three dimensions, because in two dimensions there is no vector perpendicular to two vectors, and in four or more dimensions there are many vectors perpendicular to two vectors. It is therefore useful to define a more general version, the outer product (H. Grassmann, 1862 Die Ausdehnungslehre, Berlin) that works in Euclidean space Rn of any dimension. 3. The outer product a ∧ b, also known as the wedge product, of two vectors a and b is a bivector, a multivector of dimension 2, or grade 2. The bivector a ∧ b is the directed 2-dimensional area, of magnitude |a| |b| sin θ, of the parallelogram formed by the vectors a and b, as illustrated in Figure 14.1. The bivector has an orientation, or handedness, defined by circulating the parallelogram first along a, then along b. The outer product is anticommutative, a ∧ b = −b ∧ a, like its forebear the vector product. The outer product can be repeated, so that (a ∧ b) ∧ c is a trivector, a directed volume, a multivector of grade 3. The magnitude of the trivector is the volume of the parallelepiped defined by the vectors a, b, and c, illustrated in Figure 14.1. 
The outer product is by construction associative, (a ∧ b) ∧ c = a ∧ (b ∧ c). Associativity, together with anticommutativity of bivectors, implies that the trivector a ∧ b ∧ c is totally antisymmetric under permutations of the three vectors, that is, it is unchanged under even permutations, and changes sign under odd permutations. The ordering of an outer product thus defines one of two handednesses.


It is a familiar concept that a vector a can be regarded as a geometric object, a directed length, independent of the coordinates used to describe it. The components of a vector change when the reference frame changes, but the vector itself remains the same physical thing. In the same way, a bivector a ∧ b is a directed area, and a trivector a ∧ b ∧ c is a directed volume, both geometric objects with a physical meaning independent of the coordinate system. In two dimensions the triple outer product of any three vectors is zero, a ∧ b ∧ c = 0, because the volume of a parallelepiped confined to a plane is zero. More generally, in n-dimensional space $\mathbb{R}^n$, the outer product of n + 1 vectors is zero:

$$
\mathbf{a}_1 \wedge \mathbf{a}_2 \wedge \cdots \wedge \mathbf{a}_{n+1} = 0 \quad (n \text{ dimensions}) . \tag{14.1}
$$

14.2 Geometric product

The inner and outer products offer two different ways of multiplying vectors. However, by itself neither product conforms to the usual desideratum of multiplication, that the product of two elements of a set be an element of the set. Taking the inner product of a vector with another vector lowers the dimension by one, while taking the outer product raises the dimension by one. H. Grassmann (1877, “Der Ort der Hamilton’schen Quaternionen in der Ausdehnungslehre,” Math. Ann. 12, 375) and W. K. Clifford (1878, “Applications of Grassmann’s extensive algebra,” Am. J. Math. 1, 350) resolved the problem by defining a multivector as any linear combination of scalars, vectors, bivectors, and objects of higher grade. Let $\boldsymbol\gamma_1, \boldsymbol\gamma_2, ..., \boldsymbol\gamma_n$ form an orthonormal basis for n-dimensional Euclidean space $\mathbb{R}^n$. A multivector in n = 2 dimensions is then a linear combination of

$$
\underbrace{1}_{1\ \text{scalar}} , \quad
\underbrace{\boldsymbol\gamma_1 ,\ \boldsymbol\gamma_2}_{2\ \text{vectors}} , \quad
\underbrace{\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2}_{1\ \text{bivector}} \tag{14.2}
$$

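forming a linear space of dimension $1 + 2 + 1 = 4 = 2^2$. Similarly, a multivector in n = 3 dimensions is a linear combination of

$$
\underbrace{1}_{1\ \text{scalar}} , \quad
\underbrace{\boldsymbol\gamma_1 ,\ \boldsymbol\gamma_2 ,\ \boldsymbol\gamma_3}_{3\ \text{vectors}} , \quad
\underbrace{\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2 ,\ \boldsymbol\gamma_2 \wedge \boldsymbol\gamma_3 ,\ \boldsymbol\gamma_3 \wedge \boldsymbol\gamma_1}_{3\ \text{bivectors}} , \quad
\underbrace{\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2 \wedge \boldsymbol\gamma_3}_{1\ \text{trivector}} \tag{14.3}
$$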

forming a linear space of dimension $1 + 3 + 3 + 1 = 8 = 2^3$. In general, multivectors in n dimensions form a linear space of dimension $2^n$, with $n!/[m!(n-m)!]$ distinct basis elements of grade m. A multivector a in n-dimensional Euclidean space $\mathbb{R}^n$ can thus be written as a linear combination of basis elements

$$
\mathbf{a} = \sum_{\text{distinct } \{i,j,...,m\} \subseteq \{1,2,...,n\}} a^{ij...m} \, \boldsymbol\gamma_i \wedge \boldsymbol\gamma_j \wedge ... \wedge \boldsymbol\gamma_m \tag{14.4}
$$

the sum being over all $2^n$ distinct subsets of {1, 2, ..., n}. The index on each component $a^{ij...m}$ is a totally antisymmetric quantity, reflecting the total antisymmetry of $\boldsymbol\gamma_i \wedge \boldsymbol\gamma_j \wedge ... \wedge \boldsymbol\gamma_m$.

The point of introducing multivectors is to allow multiplication to be defined so that the product of two multivectors is a multivector. The key trick is to define the geometric product ab of two vectors a and b to be the sum of their inner and outer products:

$$
\mathbf{a}\mathbf{b} = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b} . \tag{14.5}
$$

That is a seriously big trick, and if you buy a ticket to it, you are in for a seriously big ride. As a particular example of (14.5), the geometric product of any element $\boldsymbol\gamma_i$ of the orthonormal basis with itself is a scalar, and with any other element of the basis is a bivector:

$$
\boldsymbol\gamma_i \boldsymbol\gamma_j =
\begin{cases}
1 & (i = j) \\
\boldsymbol\gamma_i \wedge \boldsymbol\gamma_j & (i \neq j) .
\end{cases} \tag{14.6}
$$

Conversely, the rules (14.6), plus distributivity, imply the multiplication rule (14.5). A generalization of the rule (14.6) completes the definition of the geometric product:

$$
\boldsymbol\gamma_i \boldsymbol\gamma_j ... \boldsymbol\gamma_m = \boldsymbol\gamma_i \wedge \boldsymbol\gamma_j \wedge ... \wedge \boldsymbol\gamma_m \quad (i, j, ..., m \text{ all distinct}) . \tag{14.7}
$$

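The rules (14.6) and (14.7), along with the usual requirements of associativity and distributivity, combined with commutativity of scalars and anticommutativity of pairs of $\boldsymbol\gamma_i$, uniquely define multiplication over the space of multivectors. For example, the product of the bivector $\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2$ with the vector $\boldsymbol\gamma_1$ is

$$
(\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2) \, \boldsymbol\gamma_1 = \boldsymbol\gamma_1 \boldsymbol\gamma_2 \boldsymbol\gamma_1 = -\boldsymbol\gamma_2 \boldsymbol\gamma_1 \boldsymbol\gamma_1 = -\boldsymbol\gamma_2 . \tag{14.8}
$$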

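Sometimes it is convenient to denote the wedge product (14.7) of distinct basis elements by the abbreviated symbol $\boldsymbol\gamma_{ij...m}$:

$$
\boldsymbol\gamma_{ij...m} \equiv \boldsymbol\gamma_i \wedge \boldsymbol\gamma_j \wedge ... \wedge \boldsymbol\gamma_m \quad (i, j, ..., m \text{ all distinct}) . \tag{14.9}
$$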

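By construction, $\boldsymbol\gamma_{ij...m}$ is antisymmetric in its indices. The product of two general multivectors $\mathbf{a} = a^\alpha \boldsymbol\gamma_\alpha$ and $\mathbf{b} = b^\alpha \boldsymbol\gamma_\alpha$, with paired indices implicitly summed over distinct subsets of {1, ..., n}, is

$$
\mathbf{a}\mathbf{b} = a^\alpha b^\beta \, \boldsymbol\gamma_\alpha \boldsymbol\gamma_\beta . \tag{14.10}
$$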

Does the geometric algebra form a group under multiplication? No. One of the defining properties of a group is that every element should have an inverse. But, for example,

$$
(1 + \boldsymbol\gamma_1)(1 - \boldsymbol\gamma_1) = 0 \tag{14.11}
$$

shows that neither $1 + \boldsymbol\gamma_1$ nor $1 - \boldsymbol\gamma_1$ has an inverse.
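The multiplication rules (14.6) and (14.7) are mechanical enough to sketch in a few lines of Python. The sketch below is an illustration under stated assumptions, not something taken from the text: a basis blade is stored as a bitmask (bit i−1 set means that γi is present, so 3 stands for γ1 ∧ γ2), a multivector is a dictionary from bitmask to coefficient, the metric is Euclidean, and the names `blade_sign` and `gp` are invented for this example.

```python
from itertools import product

def blade_sign(a: int, b: int) -> int:
    """Sign from reordering the product of basis blades a and b (bitmasks)
    into increasing index order, counting one sign flip per swap of
    two distinct, anticommuting basis vectors."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1

def gp(x: dict, y: dict) -> dict:
    """Geometric product of multivectors stored as {blade_bitmask: coefficient}."""
    out = {}
    for (ba, ca), (bb, cb) in product(x.items(), y.items()):
        blade = ba ^ bb  # a repeated gamma_i squares to 1 and drops out
        out[blade] = out.get(blade, 0) + blade_sign(ba, bb) * ca * cb
    # Drop zero coefficients so that the zero multivector is the empty dict.
    return {k: v for k, v in out.items() if v != 0}
```

With these definitions, `gp({3: 1}, {1: 1})` gives `{2: -1}`, that is −γ2 as in equation (14.8), and `gp({0: 1, 1: 1}, {0: 1, 1: -1})` gives the empty dictionary, the zero of equation (14.11). For two vectors the result splits into a grade-0 part (the inner product) and a grade-2 part (the outer product), as in (14.5).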

14.3 Reverse

The reverse of any basis element is defined to be the reversed product

$$
\overline{\boldsymbol\gamma_i \wedge \boldsymbol\gamma_j \wedge ... \wedge \boldsymbol\gamma_m} \equiv \boldsymbol\gamma_m \wedge ... \wedge \boldsymbol\gamma_j \wedge \boldsymbol\gamma_i . \tag{14.12}
$$

The reverse $\bar{\mathbf{a}}$ of any multivector a is the multivector obtained by reversing each of its components. Reversion leaves unchanged all multivectors whose grade is 0 or 1, modulo 4, and changes the sign of all multivectors whose grade is 2 or 3, modulo 4. For example, scalars and vectors are unchanged by reversion, but bivectors and trivectors change sign. Reversion satisfies

$$
\overline{\mathbf{a} + \mathbf{b}} = \bar{\mathbf{a}} + \bar{\mathbf{b}} , \tag{14.13}
$$

$$
\overline{\mathbf{a}\mathbf{b}} = \bar{\mathbf{b}} \bar{\mathbf{a}} . \tag{14.14}
$$

Among other things, it follows that the reverse of any product of multivectors is the reversed product, as you would hope:

$$
\overline{\mathbf{a}\mathbf{b} ... \mathbf{c}} = \bar{\mathbf{c}} ... \bar{\mathbf{b}} \bar{\mathbf{a}} . \tag{14.15}
$$
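The modulo-4 pattern stated above follows from counting swaps: reversing a product of m mutually anticommuting basis vectors takes m(m−1)/2 transpositions, each contributing a factor of −1. A minimal sketch of that count, with the function name `reverse_sign` chosen only for this illustration:

```python
def reverse_sign(grade: int) -> int:
    """Sign acquired by a grade-m basis blade under reversion.

    Reversing m anticommuting factors needs m(m-1)/2 pairwise swaps,
    so the sign is (-1)^(m(m-1)/2)."""
    swaps = grade * (grade - 1) // 2
    return -1 if swaps % 2 else 1

# Grades 0..7 give +, +, -, -, +, +, -, -
signs = [reverse_sign(m) for m in range(8)]
```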

14.4 The pseudoscalar and the Hodge dual

Orthogonal to any m-dimensional subspace of n-dimensional space is an (n−m)-dimensional space, called the Hodge dual space. For example, the Hodge dual of a bivector in 2 dimensions is a 0-dimensional object, a pseudoscalar. Similarly, the Hodge dual of a bivector in 3 dimensions is a 1-dimensional object, a pseudovector. Define the pseudoscalar $i_n$ in n dimensions to be

$$
i_n \equiv \boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2 \wedge ... \wedge \boldsymbol\gamma_n \tag{14.16}
$$

with reverse

$$
\bar{i}_n = (-)^{[n/2]} \, \boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2 \wedge ... \wedge \boldsymbol\gamma_n . \tag{14.17}
$$

The quantity [n/2] in equation (14.17) signifies the largest integer less than or equal to n/2. The square of the pseudoscalar is

$$
i_n^2 = (-)^{[n/2]} =
\begin{cases}
1 & \text{if } n = (0 \text{ or } 1) \text{ modulo } 4 \\
-1 & \text{if } n = (2 \text{ or } 3) \text{ modulo } 4 .
\end{cases} \tag{14.18}
$$

The pseudoscalar anticommutes (commutes) with vectors a, that is, with multivectors of grade 1, if n is even (odd):

$$
\begin{aligned}
i_n \mathbf{a} &= -\mathbf{a} \, i_n && \text{if } n \text{ is even} \\
i_n \mathbf{a} &= \mathbf{a} \, i_n && \text{if } n \text{ is odd} .
\end{aligned} \tag{14.19}
$$

This implies that the pseudoscalar $i_n$ commutes with all even grade elements of the geometric algebra, and that it anticommutes (commutes) with all odd elements of the algebra if n is even (odd).

Exercise 14.1 Prove that the only multivectors that commute with all elements of the algebra are linear combinations of the scalar 1 and, if n is odd, the pseudoscalar $i_n$. ⋄

The Hodge dual ${}^*\mathbf{a}$ of a multivector a in n dimensions is defined by premultiplication by the pseudoscalar $i_n$,

$$
{}^*\mathbf{a} \equiv i_n \mathbf{a} . \tag{14.20}
$$

In 3 dimensions, the Hodge duals of the basis vectors $\boldsymbol\gamma_i$ are the bivectors

$$
i_3 \boldsymbol\gamma_1 = \boldsymbol\gamma_2 \wedge \boldsymbol\gamma_3 , \quad
i_3 \boldsymbol\gamma_2 = \boldsymbol\gamma_3 \wedge \boldsymbol\gamma_1 , \quad
i_3 \boldsymbol\gamma_3 = \boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2 . \tag{14.21}
$$

Thus in 3 dimensions the bivector a ∧ b is seen to be the pseudovector Hodge dual to the familiar vector product a × b:

$$
\mathbf{a} \wedge \mathbf{b} = i_3 \, \mathbf{a} \times \mathbf{b} . \tag{14.22}
$$

14.5 Reflection

Multiplying a vector (a multivector of grade 1) by a vector shifts the grade (dimension) of the vector by ±1. Thus, if one wants to transform a vector into another vector (with the same grade, one), at least two multiplications by a vector are required. The simplest non-trivial transformation of a vector a is

$$
\mathbf{n} : \mathbf{a} \to \mathbf{n}\mathbf{a}\mathbf{n} \tag{14.23}
$$

in which the vector a is multiplied on both left and right with a unit vector n. If a is resolved into components $\mathbf{a}_\parallel$ and $\mathbf{a}_\perp$ respectively parallel and perpendicular to n, then the transformation (14.23) is

$$
\mathbf{n} : \mathbf{a}_\parallel + \mathbf{a}_\perp \to \mathbf{a}_\parallel - \mathbf{a}_\perp \tag{14.24}
$$

which represents a reflection of the vector a through the axis n, a reversal of all components of the vector perpendicular to n, as illustrated by Figure 14.2. Note that −nan is the reflection of a through the hypersurface normal to n, a reversal of the component of the vector parallel to n.

Figure 14.2 Reflection of a vector a through axis n. (The figure shows the vector a, the axis n, and the reflected vectors nan and −nan.)

The operation of left- and right-multiplying by a unit vector n reflects not only vectors, but multivectors a in general:

$$
\mathbf{n} : \mathbf{a} \to \mathbf{n}\mathbf{a}\mathbf{n} . \tag{14.25}
$$

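For example, the product ab of two vectors transforms as

$$
\mathbf{n} : \mathbf{a}\mathbf{b} \to \mathbf{n}(\mathbf{a}\mathbf{b})\mathbf{n} = (\mathbf{n}\mathbf{a}\mathbf{n})(\mathbf{n}\mathbf{b}\mathbf{n}) \tag{14.26}
$$

which works because $\mathbf{n}^2 = 1$. A reflection leaves any scalar λ unchanged, $\mathbf{n} : \lambda \to \mathbf{n}\lambda\mathbf{n} = \lambda\mathbf{n}^2 = \lambda$. Geometrically, a reflection preserves the lengths of, and angles between, all vectors.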
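In components, the reflection (14.23) can be evaluated without any multivector machinery. From the definition (14.5) of the geometric product, an + na = 2 a · n, so that nan = n(2 a · n − na) = 2 (a · n) n − a for a unit vector n, which is the same thing as the right hand side of (14.24). The short Python sketch below applies that formula; the function name `reflect` and the list representation of vectors are assumptions made for illustration.

```python
def reflect(a, n):
    """Return n a n = 2 (a.n) n - a for a unit vector n.

    Components of a parallel to n are kept; perpendicular ones flip sign."""
    dot = sum(ai * ni for ai, ni in zip(a, n))
    return [2 * dot * ni - ai for ai, ni in zip(a, n)]

a = [1.0, 2.0, 3.0]
n = [0.0, 0.0, 1.0]          # unit vector along the 3-axis
reflected = reflect(a, n)    # the 1- and 2-components change sign
```

Applying `reflect` twice with the same n returns the original vector, consistent with the statement that two reflections through the same axis leave everything unchanged.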

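14.6 Rotation

Two successive reflections yield a rotation. Consider reflecting a vector a (a multivector of grade 1) first through the unit vector m, then through the unit vector n:

$$
\mathbf{m}\mathbf{n} : \mathbf{a} \to \mathbf{n}\mathbf{m}\mathbf{a}\mathbf{m}\mathbf{n} . \tag{14.27}
$$

Any component $\mathbf{a}_\perp$ of a simultaneously orthogonal to both m and n (i.e. $\mathbf{m} \cdot \mathbf{a}_\perp = \mathbf{n} \cdot \mathbf{a}_\perp = 0$) is unchanged by the transformation (14.27), since each reflection flips the sign of $\mathbf{a}_\perp$:

$$
\mathbf{m}\mathbf{n} : \mathbf{a}_\perp \to \mathbf{n}\mathbf{m}\mathbf{a}_\perp\mathbf{m}\mathbf{n} = -\mathbf{n}\mathbf{a}_\perp\mathbf{n} = \mathbf{a}_\perp . \tag{14.28}
$$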

Rotations inherit from reflections the property of preserving the lengths of, and angles between, all vectors. Thus the transformation (14.27) must represent a rotation of those components $\mathbf{a}_\parallel$ of a lying in the 2-dimensional plane spanned by m and n, as illustrated by Figure 14.3. To determine the angle by which the plane is rotated, it suffices to consider the case where the vector $\mathbf{a}_\parallel$ is equal to m (or n, as a check). It is not too hard to figure out that, if the angle from m to n is θ/2, then the rotation angle is θ in the same sense, from m to n.

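Figure 14.3 Rotation of a vector a by the bivector mn. Baffled? Hey, draw your own picture. (The figure shows the unit vectors m and n, the vector a, the intermediate reflection mam, the rotated vector nmamn, and the rotation angle θ.)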

For example, if m and n are parallel, so that m = ±n, then the angle between m and n is θ/2 = 0 or π, and the transformation (14.27) rotates the vector $\mathbf{a}_\parallel$ by θ = 0 or 2π, that is, it leaves $\mathbf{a}_\parallel$ unchanged. This makes sense: two reflections through the same plane leave everything unchanged. If on the other hand m and n are orthogonal, then the angle between them is θ/2 = ±π/2, and the transformation (14.27) rotates $\mathbf{a}_\parallel$ by θ = ±π, that is, it maps $\mathbf{a}_\parallel$ to $-\mathbf{a}_\parallel$. The rotation (14.27) can be abbreviated

$$
R : \mathbf{a} \to \bar{R} \mathbf{a} R \tag{14.29}
$$

where R = mn is called a rotor, and $\bar{R} = \mathbf{n}\mathbf{m}$ is its reverse. Rotors are unimodular, satisfying $R\bar{R} = \bar{R}R = 1$. According to the discussion above, the transformation (14.29) corresponds to a rotation by angle θ in the m–n plane if the angle from m to n is θ/2. Then $\mathbf{m} \cdot \mathbf{n} = \cos(\theta/2)$ and $\mathbf{m} \wedge \mathbf{n} = (\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2) \sin(\theta/2)$, where $\boldsymbol\gamma_1$ and $\boldsymbol\gamma_2$ are two orthonormal vectors spanning the m–n plane, oriented so that the angle from $\boldsymbol\gamma_1$ to $\boldsymbol\gamma_2$ is positive π/2 (i.e. $\boldsymbol\gamma_1$ is the x-axis and $\boldsymbol\gamma_2$ the y-axis). Note that the outer product $\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2$ is invariant under rotations in the m–n plane, hence independent of the choice of orthonormal basis vectors $\boldsymbol\gamma_1$ and $\boldsymbol\gamma_2$. It follows that the rotor R corresponding to a right-handed rotation by θ in the $\boldsymbol\gamma_1$–$\boldsymbol\gamma_2$ plane is given by

$$
R = \cos\frac{\theta}{2} + (\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2) \sin\frac{\theta}{2} . \tag{14.30}
$$

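It is straightforward to check that the rotor (14.30) rotates the basis vectors $\boldsymbol\gamma_i$ as

$$
R : \boldsymbol\gamma_1 \to \bar{R} \boldsymbol\gamma_1 R = \boldsymbol\gamma_1 \cos\theta + \boldsymbol\gamma_2 \sin\theta , \tag{14.31a}
$$

$$
R : \boldsymbol\gamma_2 \to \bar{R} \boldsymbol\gamma_2 R = \boldsymbol\gamma_2 \cos\theta - \boldsymbol\gamma_1 \sin\theta , \tag{14.31b}
$$

$$
R : \boldsymbol\gamma_i \to \bar{R} \boldsymbol\gamma_i R = \boldsymbol\gamma_i \quad (i \neq 1, 2) , \tag{14.31c}
$$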

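which indeed corresponds to a right-handed rotation by angle θ in the $\boldsymbol\gamma_1$–$\boldsymbol\gamma_2$ plane. The inverse rotation is

$$
\bar{R} : \mathbf{a} \to R \mathbf{a} \bar{R} \tag{14.32}
$$

with

$$
\bar{R} = \cos\frac{\theta}{2} - (\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2) \sin\frac{\theta}{2} . \tag{14.33}
$$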

A rotation of the form (14.30), a rotation in a single plane, is called a simple rotation. A rotation first by R and then by S transforms a vector a as

$$
RS : \mathbf{a} \to \bar{S}\bar{R} \, \mathbf{a} \, RS = \overline{RS} \, \mathbf{a} \, RS . \tag{14.34}
$$

Thus the composition of two rotations, first R and then S, is given by their geometric product RS. In three dimensions or less, all rotations are simple, but in four dimensions or higher, compositions of simple rotations can yield rotations that are not simple. For example, a rotation in the $\boldsymbol\gamma_1$–$\boldsymbol\gamma_2$ plane followed by a rotation in the $\boldsymbol\gamma_3$–$\boldsymbol\gamma_4$ plane is not equivalent to any simple rotation. However, it will be seen in §14.17 that bivectors in the 4D spacetime algebra have a natural complex structure, which allows 4D spacetime rotations to take a simple form similar to (14.30), but with complex angle θ and two orthogonal planes of rotation combined into a complex pair of planes.

Simple rotors are both even and unimodular, and composition preserves those properties. A rotor R is defined in general to be any even, unimodular ($R\bar{R} = 1$) element of the geometric algebra. The set of rotors defines a group, the rotor group, also referred to here as the rotation group. A rotor R rotates not only vectors, but multivectors a in general:

$$
R : \mathbf{a} \to \bar{R} \mathbf{a} R . \tag{14.35}
$$

For example, the product ab of two vectors transforms as

$$
R : \mathbf{a}\mathbf{b} \to \bar{R}(\mathbf{a}\mathbf{b})R = (\bar{R}\mathbf{a}R)(\bar{R}\mathbf{b}R) \tag{14.36}
$$

which works because $R\bar{R} = 1$.

Concept question 14.2 If vectors rotate twice as fast as rotors, do bivectors rotate twice as fast as vectors? What happens to a bivector when you rotate it by π radians? Construct a mental picture of a rotating bivector. ⋄

To summarize, the characterization of rotations by rotors has considerable advantages. Firstly, the transformation (14.35) applies to multivectors a of arbitrary grade in arbitrarily many dimensions. Secondly, the composition law is particularly simple, the composition of two rotations being given by their geometric product. A third advantage is that rotors rotate not only vectors and multivectors, but also spin-½ objects (indeed rotors are themselves spin-½ objects), as might be suspected from the intriguing factor of ½ in front of the angle θ in equation (14.30).

14.7 A rotor is a spin-½ object

A rotor was defined in the previous section, §14.6, as an even, unimodular element of the geometric algebra. As a multivector, a rotor R would transform under a rotation by the rotor S as $R \to \bar{S}RS$. As a rotor, however, the rotor R transforms under a rotation by the rotor S as

$$
S : R \to RS , \tag{14.37}
$$

according to the transformation law (14.34). That is, composition in the rotor group is defined by the transformation (14.37): R rotated by S is RS. The expression (14.30) for a simple rotation in the $\boldsymbol\gamma_1$–$\boldsymbol\gamma_2$ plane shows that the rotor corresponding to a rotation by 2π is −1. Thus under a rotation (14.37) by 2π, a rotor R changes sign:

$$
2\pi : R \to -R . \tag{14.38}
$$

A rotation by 4π is necessary to bring the rotor R back to its original value:

$$
4\pi : R \to R . \tag{14.39}
$$


Thus a rotor R behaves like a spin-½ object, requiring 2 full rotations to restore it to its original state. The two different transformation laws for a rotor (as a multivector, and as a rotor) describe two different physical situations. The transformation of a rotor as a multivector answers the question, what is the form of a rotor R rotated into another, primed, frame? In the unprimed frame, the rotor R transforms a multivector a to $\bar{R}\mathbf{a}R$. In the primed frame rotated by rotor S from the unprimed frame, $\mathbf{a}' = \bar{S}\mathbf{a}S$, the transformed rotor is $\bar{S}RS$, since

$$
\mathbf{a}' = \bar{S}\mathbf{a}S \to \bar{S}\bar{R}\mathbf{a}RS = \bar{S}\bar{R}S \, \mathbf{a}' \, \bar{S}RS . \tag{14.40}
$$

By contrast, the transformation (14.37) of a rotor as a rotor answers the question, what is the rotor corresponding to a rotation R followed by a rotation S?

14.8 A multivector rotation is an active rotation

In most of the rest of this book, indices indicate how an object transforms, so that the notation

$$
a^m \boldsymbol\gamma_m \tag{14.41}
$$

indicates a scalar, an object that is unchanged by a transformation, because the transformation of the contravariant vector $a^m$ cancels against the corresponding transformation of the covariant vector $\boldsymbol\gamma_m$. However, the transformation (14.35) of a multivector is to be understood as an active transformation that rotates the basis vectors $\boldsymbol\gamma_\alpha$ while keeping the coefficients $a^\alpha$ fixed, as opposed to a passive transformation that rotates the tetrad while keeping the thing itself unchanged. Thus a multivector $\mathbf{a} \equiv a^\alpha \boldsymbol\gamma_\alpha$ (implicit summation over α ⊆ {1, ..., n}) is not a scalar under the transformation (14.35), but rather transforms to the multivector $\mathbf{a}' \equiv a^\alpha \boldsymbol\gamma'_\alpha$ given by

$$
R : a^\alpha \boldsymbol\gamma_\alpha \to a^\alpha \bar{R} \boldsymbol\gamma_\alpha R = a^\alpha \boldsymbol\gamma'_\alpha . \tag{14.42}
$$

An explicit example is the transformation (14.31) of the tetrad axes $\boldsymbol\gamma_i$ under a right-handed rotation by angle θ.

14.9 2D rotations and complex numbers

Section 14.6 identified the rotation group in n dimensions with the geometric subalgebra of even, unimodular multivectors. In two dimensions, the even grade multivectors are linear combinations of the basis set

$$
\underbrace{1}_{1\ \text{scalar}} , \quad
\underbrace{i_2}_{1\ \text{bivector (pseudoscalar)}} \tag{14.43}
$$

forming a linear space of dimension 2. The sole bivector is the pseudoscalar $i_2 \equiv \boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2$, equation (14.16), the highest grade element in 2 dimensions. The rotor R that generates a right-handed rotation by angle θ is, according to equation (14.30),

$$
R = e^{\boldsymbol\theta/2} = e^{i_2 \theta/2} = \cos\frac{\theta}{2} + i_2 \sin\frac{\theta}{2} , \tag{14.44}
$$

where $\boldsymbol\theta = i_2 \theta$ is the bivector of magnitude θ. Since the square of the pseudoscalar $i_2$ is minus one, the pseudoscalar resembles the pure imaginary i, the square root of −1. Sure enough, the mapping

$$
i_2 \leftrightarrow i \tag{14.45}
$$

defines an isomorphism between the algebra of even grade multivectors in 2 dimensions and the field of complex numbers

$$
a + i_2 b \leftrightarrow a + i b . \tag{14.46}
$$

With the isomorphism (14.46), the rotor R that generates a right-handed rotation by angle θ is equivalent to the complex number

$$
R = e^{i\theta/2} . \tag{14.47}
$$

The associated reverse rotor $\bar{R}$ is

$$
\bar{R} = e^{-i\theta/2} , \tag{14.48}
$$

the complex conjugate of R. The group of 2D rotors is isomorphic to the group of complex numbers of unit magnitude, the unitary group U(1),

$$
\text{2D rotors} \leftrightarrow U(1) . \tag{14.49}
$$
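The correspondence (14.47) between 2D rotors and unit complex numbers can be tried out directly with Python's built-in complex type. The sketch below is illustrative only, and the function names are invented for it. The helper `rotate_vector` rests on an observation that is not spelled out in the text: a vector x γ1 + y γ2 can be written γ1 (x + i2 y), and because i2 anticommutes with vectors while even elements commute with each other, the rotation $\bar{R}\mathbf{a}R$ reduces to multiplying x + i y by R².

```python
import cmath
import math

def rotor2d(theta: float) -> complex:
    """Rotor for a right-handed rotation by theta, as the unit complex number e^{i theta/2}."""
    return cmath.exp(0.5j * theta)

def rotate_vector(x: float, y: float, theta: float) -> tuple:
    """Rotate the vector x*gamma_1 + y*gamma_2 right-handedly by theta.

    Uses R-bar a R = gamma_1 (x + i y) R^2, valid in 2 dimensions."""
    z = complex(x, y) * rotor2d(theta) ** 2
    return (z.real, z.imag)

half_turn_twice = rotor2d(2 * math.pi)   # close to -1: the rotor changes sign
two_full_turns = rotor2d(4 * math.pi)    # close to +1: back to the start
quarter = rotate_vector(1.0, 0.0, math.pi / 2)  # gamma_1 goes to (nearly) gamma_2
```

The rotor itself turns through only half the angle that the vector does, which is the factor of ½ in the exponent of (14.47).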

Let z denote an even multivector, equivalent to some complex number by the isomorphism (14.46). According to the transformation formula (14.35), under the rotation $R = e^{i\theta/2}$, the even multivector, or complex number, z transforms as

$$
R : z \to e^{-i\theta/2} \, z \, e^{i\theta/2} = e^{-i\theta/2} e^{i\theta/2} \, z = z \tag{14.50}
$$

which is true because even multivectors in 2 dimensions commute, as complex numbers should. Equation (14.50) shows that the even multivector, or complex number, z is unchanged by a rotation. This might seem strange: shouldn’t the rotation rotate the complex number z by θ in the Argand plane? The answer is that the rotation $R : \mathbf{a} \to \bar{R}\mathbf{a}R$ rotates vectors $\boldsymbol\gamma_1$ and $\boldsymbol\gamma_2$ (Exercise 14.3), as already seen in the transformation (14.31). The same rotation leaves the scalar 1 and the bivector $i_2 \equiv \boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2$ unchanged. If temporarily you permit yourself to think in 3 dimensions, you see that the bivector $\boldsymbol\gamma_1 \wedge \boldsymbol\gamma_2$ is Hodge dual to the pseudovector $\boldsymbol\gamma_1 \times \boldsymbol\gamma_2$, which is the axis of rotation and is itself unchanged by the rotation, even though the individual vectors $\boldsymbol\gamma_1$ and $\boldsymbol\gamma_2$ are rotated.

Exercise 14.3 Confirm that a right-handed rotation by angle θ rotates the axes $\boldsymbol\gamma_i$ by

$$
R : \boldsymbol\gamma_1 \to e^{-i\theta/2} \, \boldsymbol\gamma_1 \, e^{i\theta/2} = \boldsymbol\gamma_1 \cos\theta + \boldsymbol\gamma_2 \sin\theta , \tag{14.51a}
$$

$$
R : \boldsymbol\gamma_2 \to e^{-i\theta/2} \, \boldsymbol\gamma_2 \, e^{i\theta/2} = \boldsymbol\gamma_2 \cos\theta - \boldsymbol\gamma_1 \sin\theta , \tag{14.51b}
$$

in agreement with (14.31). The important thing to notice is that the pseudoscalar $i_2$, hence i, anticommutes with the vectors $\boldsymbol\gamma_i$. ⋄

14.10 Quaternions

A quaternion can be regarded as a kind of souped-up complex number

$$
q = a + \imath b_1 + \jmath b_2 + k b_3 , \tag{14.52}
$$

where a and $b_i$ (i = 1, 2, 3) are real numbers, and the three imaginary numbers $\imath$, $\jmath$, $k$, also denoted $\imath_1$, $\imath_2$, $\imath_3$ here for convenience and brevity, are defined to satisfy¹

$$
\imath^2 = \jmath^2 = k^2 = -\imath\jmath k = -1 . \tag{14.53}
$$

Remark the dotless $\imath$ (and dotless $\jmath$), to distinguish these quaternionic imaginaries from other possible imaginaries. A consequence of equations (14.53) is that each pair of imaginary numbers anticommutes:

$$
\imath\jmath = -\jmath\imath = -k , \quad
\jmath k = -k\jmath = -\imath , \quad
k\imath = -\imath k = -\jmath . \tag{14.54}
$$

A quaternion (14.52) can be expressed compactly as a sum of its scalar, a, and vector (actually pseudovector, as will become apparent below from the isomorphism (14.68)), $\boldsymbol\imath \cdot \mathbf{b}$, parts

$$
q = a + \boldsymbol\imath \cdot \mathbf{b} , \tag{14.55}
$$

where $\boldsymbol\imath$ is shorthand for the triple of quaternionic imaginaries,

$$
\boldsymbol\imath \equiv \{\imath, \jmath, k\} \equiv \{\imath_1, \imath_2, \imath_3\} , \tag{14.56}
$$

and where $\mathbf{b} \equiv \{b_1, b_2, b_3\}$, and $\boldsymbol\imath \cdot \mathbf{b} \equiv \imath_i b_i$ (implicit summation over i = 1, 2, 3) is the usual Euclidean dot product. A fundamentally useful formula, which follows from the defining equations (14.53), is

$$
(\boldsymbol\imath \cdot \mathbf{a})(\boldsymbol\imath \cdot \mathbf{b}) = -\mathbf{a} \cdot \mathbf{b} - \boldsymbol\imath \cdot (\mathbf{a} \times \mathbf{b}) \tag{14.57}
$$

where a × b is the usual 3D vector product. The product of two quaternions $p \equiv a + \boldsymbol\imath \cdot \mathbf{b}$ and $q \equiv c + \boldsymbol\imath \cdot \mathbf{d}$ can thus be written

$$
pq = (a + \boldsymbol\imath \cdot \mathbf{b})(c + \boldsymbol\imath \cdot \mathbf{d}) = ac - \mathbf{b} \cdot \mathbf{d} + \boldsymbol\imath \cdot (a\mathbf{d} + c\mathbf{b} - \mathbf{b} \times \mathbf{d}) . \tag{14.58}
$$

The quaternionic conjugate $\bar{q}$ of a quaternion $q \equiv a + \boldsymbol\imath \cdot \mathbf{b}$ is (the overbar symbol for quaternionic conjugation distinguishes it from the asterisk symbol ∗ for complex conjugation)

$$
\bar{q} = a - \boldsymbol\imath \cdot \mathbf{b} . \tag{14.59}
$$

¹ The choice $\imath\jmath k = 1$ in the definition (14.53) is the opposite of the conventional definition ijk = −1 famously carved by W. R. Hamilton in the stone of Brougham Bridge while walking with his wife along the Royal Canal to Dublin on 16 October 1843 (S. O’Donnell, 1983, William Rowan Hamilton: Portrait of a Prodigy, Boole Press, Dublin). To map to Hamilton’s definition, you can take $\imath = -i$, $\jmath = -j$, $k = -k$, or alternatively $\imath = i$, $\jmath = -j$, $k = k$, or $\imath = k$, $\jmath = j$, $k = i$. The adopted choice $\imath\jmath k = 1$ has the merit that it avoids a treacherous minus sign in the isomorphism (14.68) between 3-dimensional pseudovectors and quaternions. The present choice also conforms to the convention used by OpenGL and other computer graphics programs.


The quaternionic conjugate of a product is the reversed product of quaternionic conjugates

$$
\overline{pq} = \bar{q}\bar{p} \tag{14.60}
$$

just like reversion in the geometric algebra, equation (14.14) (the choice of the same symbol, an overbar, to represent both reversion and quaternionic conjugation is not coincidental). The magnitude |q| of the quaternion $q \equiv a + \boldsymbol\imath \cdot \mathbf{b}$ is

$$
|q| = (\bar{q}q)^{1/2} = (q\bar{q})^{1/2} = (a^2 + \mathbf{b} \cdot \mathbf{b})^{1/2} = (a^2 + b_1^2 + b_2^2 + b_3^2)^{1/2} . \tag{14.61}
$$

The inverse $q^{-1}$ of the quaternion, satisfying $qq^{-1} = q^{-1}q = 1$, is

$$
q^{-1} = \bar{q}/(\bar{q}q) = (a - \boldsymbol\imath \cdot \mathbf{b})/(a^2 + \mathbf{b} \cdot \mathbf{b}) = (a - \imath_1 b_1 - \imath_2 b_2 - \imath_3 b_3)/(a^2 + b_1^2 + b_2^2 + b_3^2) . \tag{14.62}
$$
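Because the sign convention here differs from Hamilton's, it may help to see the product rule (14.58) written out as code. The following Python sketch is an illustration, with quaternions stored as 4-tuples (a, b1, b2, b3) and the function names `qmul`, `qconj`, and `qinv` chosen for this example. It implements (14.58), (14.59), and (14.62) literally, including the minus sign in front of the cross product.

```python
def qmul(p, q):
    """Quaternion product in the convention of (14.58):
    pq = ac - b.d + i.(a d + c b - b x d)."""
    a, b = p[0], p[1:]
    c, d = q[0], q[1:]
    cross = (b[1] * d[2] - b[2] * d[1],
             b[2] * d[0] - b[0] * d[2],
             b[0] * d[1] - b[1] * d[0])
    scalar = a * c - sum(bi * di for bi, di in zip(b, d))
    vector = tuple(a * di + c * bi - xi for bi, di, xi in zip(b, d, cross))
    return (scalar, *vector)

def qconj(q):
    """Quaternionic conjugate (14.59): flip the sign of the imaginary part."""
    return (q[0], -q[1], -q[2], -q[3])

def qinv(q):
    """Inverse (14.62): conjugate divided by the squared magnitude."""
    norm2 = sum(c * c for c in q)
    return tuple(c / norm2 for c in qconj(q))

I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
ij = qmul(I, J)            # equals -k in this convention, equation (14.54)
ijk = qmul(qmul(I, J), K)  # equals +1, the adopted choice in (14.53)
```

A library that follows Hamilton's ijk = −1 would return +k for the first product, so results should not be mixed between the two conventions without one of the relabelings given in the footnote.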

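14.11 3D rotations and quaternions

As before, the rotation group is the group of even, unimodular multivectors of the geometric algebra. In three dimensions, the even grade multivectors are linear combinations of the basis set

$$
\underbrace{1}_{1\ \text{scalar}} , \quad
\underbrace{i_3 \boldsymbol\gamma_1 ,\ i_3 \boldsymbol\gamma_2 ,\ i_3 \boldsymbol\gamma_3}_{3\ \text{bivectors (pseudovectors)}} \tag{14.63}
$$

forming a linear space of dimension 4. The three bivectors are pseudovectors, equation (14.21). The squares of the pseudovector basis elements are all minus one,

$$
(i_3 \boldsymbol\gamma_1)^2 = (i_3 \boldsymbol\gamma_2)^2 = (i_3 \boldsymbol\gamma_3)^2 = -1 , \tag{14.64}
$$

and they anticommute with each other,

$$
\begin{aligned}
(i_3 \boldsymbol\gamma_1)(i_3 \boldsymbol\gamma_2) &= -(i_3 \boldsymbol\gamma_2)(i_3 \boldsymbol\gamma_1) = -i_3 \boldsymbol\gamma_3 , \\
(i_3 \boldsymbol\gamma_2)(i_3 \boldsymbol\gamma_3) &= -(i_3 \boldsymbol\gamma_3)(i_3 \boldsymbol\gamma_2) = -i_3 \boldsymbol\gamma_1 , \\
(i_3 \boldsymbol\gamma_3)(i_3 \boldsymbol\gamma_1) &= -(i_3 \boldsymbol\gamma_1)(i_3 \boldsymbol\gamma_3) = -i_3 \boldsymbol\gamma_2 .
\end{aligned} \tag{14.65}
$$

The rotor R that generates a rotation by angle θ right-handedly about unit vector n in 3 dimensions is, according to equation (14.30),

$$
R = e^{\boldsymbol\theta/2} = e^{i_3 \mathbf{n} \theta/2} = \cos\frac{\theta}{2} + i_3 \mathbf{n} \sin\frac{\theta}{2} , \tag{14.66}
$$

where $\boldsymbol\theta$ is the bivector

$$
\boldsymbol\theta \equiv i_3 \mathbf{n} \, \theta \tag{14.67}
$$

of magnitude θ and unit vector direction $\mathbf{n} \equiv \boldsymbol\gamma_i n_i$. Comparison of equations (14.64) and (14.65) to equations (14.53) and (14.54) shows that the mapping

$$
i_3 \boldsymbol\gamma_i \leftrightarrow \imath_i \quad (i = 1, 2, 3) \tag{14.68}
$$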




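defines an isomorphism between the space of even multivectors in 3 dimensions and the non-commutative division algebra of quaternions

$$
a + i_3 \boldsymbol\gamma_i b_i \leftrightarrow a + \imath_i b_i . \tag{14.69}
$$

With the equivalence (14.69), the rotor R that generates a rotation by angle θ right-handedly about unit vector n in 3 dimensions is equivalent to the quaternion

$$
R = e^{\boldsymbol\theta/2} = e^{\boldsymbol\imath \cdot \mathbf{n} \, \theta/2} = \cos\frac{\theta}{2} + \boldsymbol\imath \cdot \mathbf{n} \sin\frac{\theta}{2} , \tag{14.70}
$$

where $\boldsymbol\theta$ is the pseudovector quaternion

$$
\boldsymbol\theta \equiv \boldsymbol\imath \cdot \mathbf{n} \, \theta \equiv (\imath_1 n_1 + \imath_2 n_2 + \imath_3 n_3) \, \theta \tag{14.71}
$$

whose magnitude is $|\boldsymbol\theta| = \theta$ and whose unit direction is $\hat{\boldsymbol\theta} \equiv \boldsymbol\theta/\theta = \boldsymbol\imath \cdot \mathbf{n}$. The associated reverse rotor $\bar{R}$ is

$$
\bar{R} = e^{-\boldsymbol\theta/2} = e^{-\boldsymbol\imath \cdot \mathbf{n} \, \theta/2} = \cos\frac{\theta}{2} - \boldsymbol\imath \cdot \mathbf{n} \sin\frac{\theta}{2} , \tag{14.72}
$$

the quaternionic conjugate of R. The group of rotors is isomorphic to the group of unit quaternions, quaternions $q = a + \imath_1 b_1 + \imath_2 b_2 + \imath_3 b_3$ satisfying $q\bar{q} = a^2 + b_1^2 + b_2^2 + b_3^2 = 1$. Unit quaternions evidently define a unit 3-sphere in the 4-dimensional space of coordinates $\{a, b_1, b_2, b_3\}$. From this it is apparent that the rotor group in 3 dimensions has the geometry of a 3-sphere $S^3$.

Exercise 14.4 This exercise is a precursor to Exercise 14.15. Let $\mathbf{b} \equiv \boldsymbol\gamma_i b_i$ be a 3D vector, a multivector of grade 1 in the 3D geometric algebra. Use the quaternionic composition rule (14.57) to show that the vector b transforms under a right-handed rotation by angle θ about unit direction $\mathbf{n} = \boldsymbol\gamma_i n_i$ as

$$
R : \mathbf{b} \to \bar{R} \, \mathbf{b} \, R = \mathbf{b} + 2 \sin\frac{\theta}{2} \, \mathbf{n} \times \left( \cos\frac{\theta}{2} \, \mathbf{b} + \sin\frac{\theta}{2} \, \mathbf{n} \times \mathbf{b} \right) . \tag{14.73}
$$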

Here the cross-product n × b denotes the usual vector product, which is dual to the bivector product n ∧ b, equation (14.22). Suppose that the quaternionic components of the rotor R are {w, x, y, z}, that is, $R = e^{\boldsymbol\imath \cdot \mathbf{n} \, \theta/2} = w + \imath_1 x + \imath_2 y + \imath_3 z$. Show that the transformation (14.73) is (note that the 3 × 3 rotation matrix is written to the right of the vector, in accordance with the computer graphics convention that rotations accumulate to the right, which is opposite to the physics convention; to recover the physics convention, take the transpose):

$$
R : \begin{pmatrix} b_1 & b_2 & b_3 \end{pmatrix} \to
\begin{pmatrix} b_1 & b_2 & b_3 \end{pmatrix}
\begin{pmatrix}
w^2 + x^2 - y^2 - z^2 & 2(xy + wz) & 2(zx - wy) \\
2(xy - wz) & w^2 - x^2 + y^2 - z^2 & 2(yz + wx) \\
2(zx + wy) & 2(yz - wx) & w^2 - x^2 - y^2 + z^2
\end{pmatrix} . \tag{14.74}
$$

Confirm that the 3 × 3 rotation matrix on the right hand side of the transformation (14.74) is an orthogonal matrix (its inverse is its transpose) provided that the rotor is unimodular, $R\bar{R} = 1$, so that $w^2 + x^2 + y^2 + z^2 = 1$. As a simple example, show that the transformation (14.74) in the case of a right-handed rotation by angle θ about the 3-axis (the 1–2 plane), where $w = \cos(\theta/2)$ and $z = \sin(\theta/2)$, is

$$
R : \begin{pmatrix} b_1 & b_2 & b_3 \end{pmatrix} \to
\begin{pmatrix} b_1 & b_2 & b_3 \end{pmatrix}
\begin{pmatrix}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix} . \tag{14.75}
$$

⋄
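The matrix in (14.74) is easy to mistype, so a numerical check is worthwhile. The Python sketch below builds the matrix from the rotor components {w, x, y, z} and applies it to a row vector from the left, following the row-vector convention stated in the exercise. The function names are made up for this illustration, and the sketch is a sanity check rather than a solution of the exercise.

```python
import math

def rotation_matrix(w, x, y, z):
    """3x3 matrix of (14.74) for rotor components {w, x, y, z}.

    Meant to multiply a row vector from the right: b' = b M."""
    return [
        [w*w + x*x - y*y - z*z, 2*(x*y + w*z),         2*(z*x - w*y)],
        [2*(x*y - w*z),         w*w - x*x + y*y - z*z, 2*(y*z + w*x)],
        [2*(z*x + w*y),         2*(y*z - w*x),         w*w - x*x - y*y + z*z],
    ]

def rotate_row(b, m):
    """Row vector times matrix: b'_j = sum_i b_i m[i][j]."""
    return [sum(b[i] * m[i][j] for i in range(3)) for j in range(3)]

# Rotation by 2*pi/3 about the direction (1, 1, 1)/sqrt(3):
# w = cos(pi/3) = 1/2 and x = y = z = sin(pi/3)/sqrt(3) = 1/2.
m_cyclic = rotation_matrix(0.5, 0.5, 0.5, 0.5)

# Rotation by theta about the 3-axis, as in (14.75).
theta = 0.4
m_axis3 = rotation_matrix(math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))
```

For the first example the matrix is a cyclic permutation that carries the 1-axis to the 2-axis, the 2-axis to the 3-axis, and the 3-axis to the 1-axis, as a right-handed rotation by 120° about the (1, 1, 1) diagonal should.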

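14.12 Pauli matrices

The Pauli matrices $\boldsymbol\sigma \equiv \sigma_i \equiv \{\sigma_1, \sigma_2, \sigma_3\}$ form a vector of 2 × 2 complex (with respect to a quantum-mechanical imaginary i) matrices whose three components are each traceless ($\operatorname{Tr} \sigma_i = 0$), Hermitian ($\sigma_i^\dagger = \sigma_i$), and unitary ($\sigma_i^\dagger \sigma_i = 1$, no implicit summation):

$$
\sigma_1 \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad
\sigma_2 \equiv \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad
\sigma_3 \equiv \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{14.76}
$$

The Pauli matrices anticommute with each other

$$
\sigma_1 \sigma_2 = -\sigma_2 \sigma_1 = i\sigma_3 , \quad
\sigma_2 \sigma_3 = -\sigma_3 \sigma_2 = i\sigma_1 , \quad
\sigma_3 \sigma_1 = -\sigma_1 \sigma_3 = i\sigma_2 . \tag{14.77}
$$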

The particular choice (14.76) of Pauli matrices is conventional but not unique: any three traceless, Hermitian, unitary, anticommuting 2 × 2 complex matrices will do. The product of the 3 Pauli matrices is i times the unit matrix,

$$
\sigma_1 \sigma_2 \sigma_3 = i \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{14.78}
$$

The multiplication rules of the Pauli matrices $\sigma_i$ are identical to those of the basis vectors $\boldsymbol\gamma_i$ of the 3D geometric algebra. If the scalar 1 in the geometric algebra is identified with the unit 2 × 2 matrix, and the pseudoscalar $i_3$ is identified with the imaginary i times the unit matrix, then the 3D geometric algebra is isomorphic to the algebra generated by the Pauli matrices, the Pauli algebra, through the mapping

$$
1 \leftrightarrow \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \quad
\boldsymbol\gamma_i \leftrightarrow \sigma_i , \quad
i_3 \leftrightarrow i \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \tag{14.79}
$$

The 3D pseudoscalar $i_3$ commutes with all elements of the 3D geometric algebra.

Concept question 14.5 The Pauli matrices are traceless, Hermitian, unitary, and anticommuting. What do these properties correspond to in the geometric algebra? Are all these properties necessary for the Pauli algebra to be isomorphic to the 3D geometric algebra? Are the properties sufficient?

The rotation group is the group of even, unimodular multivectors of the geometric algebra. The isomorphism (14.79) establishes that the rotation group is isomorphic to the group of complex 2 × 2 matrices of the form

$$
a + i \boldsymbol\sigma \cdot \mathbf{b} , \tag{14.80}
$$

with a, $b_i$ (i = 1, 2, 3) real, and with the unimodular condition requiring that $a^2 + \mathbf{b} \cdot \mathbf{b} = 1$. It is straightforward to check (Exercise 14.6) that the group of such matrices constitutes the group of unitary complex 2 × 2 matrices of unit determinant, the special unitary group SU(2). The isomorphisms

$$
a + i_3 \boldsymbol\gamma_i b_i \leftrightarrow a + \imath_i b_i \leftrightarrow a + i \sigma_i b_i \tag{14.81}
$$

have thus established isomorphisms between the group of 3D rotors, the group of unit quaternions, and the special unitary group of complex 2 × 2 matrices

$$
\text{3D rotors} \leftrightarrow \text{unit quaternions} \leftrightarrow SU(2) . \tag{14.82}
$$

An isomorphism that maps a group into a set of matrices, such that group multiplication corresponds to ordinary matrix multiplication, is called a representation of the group. The representation of the rotation group as 2 × 2 complex matrices may be termed the Pauli representation. The Pauli representation is the lowest dimensional representation of the 3D rotation group.

Exercise 14.6 Translate a rotor into an element of SU(2). Show that the rotor $R = e^{i_3 \mathbf{n} \theta/2}$, equation (14.66), corresponding to a right-handed rotation by angle θ about unit axis $\mathbf{n} \equiv \{n_1, n_2, n_3\}$ is equivalent to the special unitary 2 × 2 matrix

$$
R \leftrightarrow \begin{pmatrix}
\cos\frac{\theta}{2} + i n_3 \sin\frac{\theta}{2} & (n_2 + i n_1) \sin\frac{\theta}{2} \\
(-n_2 + i n_1) \sin\frac{\theta}{2} & \cos\frac{\theta}{2} - i n_3 \sin\frac{\theta}{2}
\end{pmatrix} . \tag{14.83}
$$

Show that the reverse rotor $\bar{R}$ is equivalent to the Hermitian conjugate $R^\dagger$ of the corresponding 2 × 2 matrix. Show that the determinant of the matrix equals $R\bar{R}$, which is 1. ⋄
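The claims of Exercise 14.6 can be spot-checked numerically before attempting the algebra. The Python sketch below constructs the matrix (14.83) as nested lists of complex numbers; the function names `su2`, `det2`, and `dagger` are hypothetical helpers written for this illustration, and a numerical check at a few angles is of course not a proof.

```python
import math

def su2(theta, n):
    """2x2 matrix of (14.83) for a right-handed rotation by theta about unit axis n."""
    n1, n2, n3 = n
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c + 1j * n3 * s, (n2 + 1j * n1) * s],
            [(-n2 + 1j * n1) * s, c - 1j * n3 * s]]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def dagger(m):
    """Hermitian conjugate (conjugate transpose) of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

axis = (0.6, 0.0, 0.8)                 # a unit vector
rot = su2(1.0, axis)                   # generic rotation
rev = su2(-1.0, axis)                  # reverse rotor: opposite angle
full_turn = su2(2 * math.pi, axis)     # close to minus the unit matrix
```

The last line illustrates the spin-½ behaviour: a rotation by 2π is represented by (numerically almost exactly) minus the unit matrix.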

14.13 Pauli spinors

In the Pauli representation, spin-½ objects ϕ are Pauli spinors, 2-dimensional complex (with respect to i) vectors

$$
\varphi = \begin{pmatrix} \varphi_\uparrow \\ \varphi_\downarrow \end{pmatrix} \tag{14.84}
$$

that are rotated by pre-multiplying by elements of the special unitary group SU(2). According to the equivalence (14.83), a rotation by 2π is represented by minus the unit matrix,

$$
\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{14.85}
$$

Consequently a rotation by 2π changes the sign of a Pauli spinor ϕ. A rotation by 4π is required to rotate a Pauli spinor back to its original value. Thus a Pauli spinor indeed behaves like a spin-½ object.

In quantum mechanics, the Pauli matrices $\sigma_i$, equations (14.76), provide a representation of the spin operator $\mathbf{s} \equiv s_i \equiv \{s_1, s_2, s_3\}$ (equation (14.86) is in units ħ = 1; in standard units, $\mathbf{s} = \frac{\hbar}{2} \boldsymbol\sigma$)

$$
\mathbf{s} = \tfrac{1}{2} \boldsymbol\sigma . \tag{14.86}
$$

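The eigenvectors of the spin operator $\mathbf{s} \cdot \boldsymbol\zeta$ projected along any axis $\boldsymbol\zeta$ define objects of definite spin ±½ measured along that axis. Each Pauli matrix $\sigma_i$ has two eigenvalues ±1, and thus each spin operator component $s_i$ has two eigenvalues ±½. The eigenvectors of $s_i$ are spin-up (eigenvalue +½) and spin-down (eigenvalue −½), as measured along the i-axis. In particular, the normalized eigenvectors of $\sigma_3$ are

$$
\uparrow \ \equiv \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \quad
\downarrow \ \equiv \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \tag{14.87}
$$

satisfying

$$
s_3 \uparrow \ = \tfrac{1}{2} \uparrow , \quad
s_3 \downarrow \ = -\tfrac{1}{2} \downarrow . \tag{14.88}
$$

The Pauli spinor ϕ, equation (14.84), can thus be expressed

$$
\varphi = \varphi_\uparrow \uparrow + \ \varphi_\downarrow \downarrow \tag{14.89}
$$

where $\varphi_\uparrow$ and $\varphi_\downarrow$ are the complex amplitudes along the up and down directions of the 3-axis (the z-axis).

Essential to quantum mechanics is the existence of an inner product. The inner product of two Pauli spinors ϕ and ψ is the product $\varphi^\dagger \psi$ of the Hermitian conjugate of ϕ with ψ. The Hermitian conjugate $\varphi^\dagger$ of the Pauli spinor ϕ (14.84) is

$$
\varphi^\dagger = \begin{pmatrix} \varphi_\uparrow^* & \varphi_\downarrow^* \end{pmatrix} . \tag{14.90}
$$

The magnitude squared of the spinor ϕ is the real number

$$
|\varphi|^2 = \varphi^\dagger \varphi = |\varphi_\uparrow|^2 + |\varphi_\downarrow|^2 . \tag{14.91}
$$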

In quantum mechanics, the magnitude squared $|\phi|^2$ of the spinor is interpreted as the total probability (or probability density) of the particle. The two parts $|\phi_\uparrow|^2$ and $|\phi_\downarrow|^2$ are the probabilities of the particle being in the up and down states. The probabilities in the up and down states depend on the direction along which the spin is measured, but the total probability $|\phi|^2$ is independent of the choice of direction.

Exercise 14.7 Orthonormal eigenvectors of the spin operator. Show that the orthonormal eigenvectors $\uparrow_\zeta$ and $\downarrow_\zeta$ of the spin operator $\mathbf{s} \cdot \boldsymbol{\zeta}$ projected along the unit direction $\boldsymbol{\zeta} \equiv \{\zeta_1, \zeta_2, \zeta_3\}$ are
$$ \uparrow_\zeta \; = \frac{1}{\sqrt{2(1 + \zeta_3)}} \begin{pmatrix} 1 + \zeta_3 \\ \zeta_1 + i\zeta_2 \end{pmatrix} , \quad \downarrow_\zeta \; = \frac{1}{\sqrt{2(1 - \zeta_3)}} \begin{pmatrix} -1 + \zeta_3 \\ \zeta_1 + i\zeta_2 \end{pmatrix} . \tag{14.92} $$
⋄
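Before attempting the algebra of Exercise 14.7, the claimed eigenvectors can be sanity-checked numerically. The sketch below is not a solution of the exercise. It assumes the standard Pauli matrices for (14.76), so that $\boldsymbol{\sigma}\cdot\boldsymbol{\zeta}$ has rows $(\zeta_3,\ \zeta_1 - i\zeta_2)$ and $(\zeta_1 + i\zeta_2,\ -\zeta_3)$; the function names are hypothetical.

```python
import math

def sigma_dot(zeta, v):
    """Apply the 2x2 matrix sigma . zeta (standard Pauli matrices assumed) to v."""
    z1, z2, z3 = zeta
    return (z3 * v[0] + (z1 - 1j * z2) * v[1],
            (z1 + 1j * z2) * v[0] - z3 * v[1])

def up_down(zeta):
    """Candidate eigenvectors of equation (14.92); requires |zeta_3| != 1."""
    z1, z2, z3 = zeta
    norm_up = math.sqrt(2 * (1 + z3))
    norm_down = math.sqrt(2 * (1 - z3))
    up = (complex(1 + z3) / norm_up, (z1 + 1j * z2) / norm_up)
    down = (complex(-1 + z3) / norm_down, (z1 + 1j * z2) / norm_down)
    return up, down

def inner(a, b):
    """Inner product a^dagger b of two Pauli spinors."""
    return a[0].conjugate() * b[0] + a[1].conjugate() * b[1]

zeta = (2 / 7, 3 / 7, 6 / 7)              # a unit direction: 4 + 9 + 36 = 49
up, down = up_down(zeta)

# sigma.zeta should return +up and -down, so s.zeta has eigenvalues +1/2 and -1/2.
res_up = sigma_dot(zeta, up)
res_down = sigma_dot(zeta, down)
err_up = max(abs(res_up[k] - up[k]) for k in range(2))
err_down = max(abs(res_down[k] + down[k]) for k in range(2))
print(err_up, err_down)
print(abs(inner(up, up)), abs(inner(down, down)), abs(inner(up, down)))
```

A numerical check at one direction is of course no substitute for the general proof the exercise asks for.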




14.14 Pauli spinors as scaled 3D rotors, or quaternions

Rotors and Pauli spinors both behave like spin-1/2 objects, requiring a rotation of $4\pi$ to bring them full circle. In fact a Pauli spinor is equivalent (14.96) to a (reverse) 3D rotor $\bar R$ scaled by a positive real scalar $\lambda$. Consequently a Pauli spinor is equivalent (14.97) to a quaternion. The 2 complex degrees of freedom of the Pauli spinor are equivalent to the 4 real degrees of freedom of the quaternion.

Start with the eigenequation (14.88) for the unit spin-up eigenvector $\uparrow$ in the 3-direction (z-direction),
$$ s_3 \uparrow \; = \tfrac{1}{2} \uparrow . \tag{14.93} $$

If the spin operator $s_3$ is rotated by rotor $R$, and the spin-up eigenvector $\uparrow$ is pre-multiplied by $\bar R$, then the eigenequation (14.93) transforms into an eigenequation for the unit Pauli spinor $\bar R \uparrow$:
$$ (\bar R s_3 R)\,(\bar R \uparrow) = \tfrac{1}{2}\,(\bar R \uparrow) \tag{14.94} $$

(notice that the isomorphism (14.79) between the geometric algebra and the Pauli algebra guarantees that the rule $s_3 \to \bar R s_3 R$ for rotating the spin operator $s_3$ is valid regardless of whether $s_3$ and $R$ are considered as elements of the geometric algebra or as $2 \times 2$ complex matrices in the Pauli algebra). The Pauli spinor $\uparrow$, equation (14.87), is normalized to unit magnitude, and the rotated Pauli spinor $\bar R \uparrow$ in equation (14.94) is likewise of unit magnitude. A general Pauli spinor $\phi$ is the product of a real scalar $\lambda$ and a rotated unit spinor $\bar R \uparrow$,
$$ \phi = \lambda \bar R \uparrow . \tag{14.95} $$
The real scalar $\lambda$ can be taken without loss of generality to be positive, since any minus sign can be absorbed into a rotation by $2\pi$ of the rotor $\bar R$. It is straightforward to check (Exercise 14.8) that any Pauli spinor $\phi$ can be expressed in the form (14.95). Equation (14.95) establishes an equivalence between Pauli spinors and reversed 3D rotors $\bar R$ scaled by a positive real scalar $\lambda$,
$$ \phi \leftrightarrow \lambda \bar R . \tag{14.96} $$

Given the equivalence (14.82) between 3D rotors and unit quaternions, it follows that Pauli spinors are equivalent to reverse quaternions (see Exercises 14.8 and 14.9 for the precise translation),
$$ \text{Pauli spinors} \leftrightarrow \text{reverse quaternions} . \tag{14.97} $$

The equivalence means that there is a one-to-one correspondence between Pauli spinors and reverse quaternions, and that they transform in the same way under 3D rotations. The Hermitian conjugate $\phi^\dagger$ of the Pauli spinor is
$$ \phi^\dagger = \; \uparrow^\dagger \lambda R , \tag{14.98} $$
where $\uparrow^\dagger = \begin{pmatrix} 1 & 0 \end{pmatrix}$ is the Hermitian conjugate of the spin-up eigenvector $\uparrow$. The squared amplitude of the Pauli spinor
$$ \phi^\dagger \phi = \lambda^2 \tag{14.99} $$
is the probability (or probability density) of the particle, which is unchanged by a rotation.


One is used to thinking of a Pauli spinor as an intrinsically quantum mechanical object. The equivalence (14.96) between Pauli spinors and scaled reverse rotors shows that Pauli spinors also have a classical interpretation: they encode a real amplitude $\lambda$ and a rotation $\bar R$. This provides a mathematical basis for the idea that, through their spin, fundamental particles “know” about the rotational structure of space.

The spin axis $\hat{\boldsymbol{\zeta}}$ of a Pauli spinor $\chi$ is the direction along which the Pauli spinor is pure up. For example, the spin axis of the spin-up eigenvector $\uparrow$ is the positive 3-axis (the z-axis), while the spin axis of the spin-down eigenvector $\downarrow$ is the negative 3-axis. In general, the spin axis of a Pauli spinor (14.95) is the unit direction $\hat{\boldsymbol{\zeta}}$ of the rotated 3-axis,
$$ \bar R s_3 R = \mathbf{s} \cdot \hat{\boldsymbol{\zeta}} . \tag{14.100} $$
The rotor $S$ corresponding to a right-handed rotation by angle $\zeta$ about the spin axis $\hat{\boldsymbol{\zeta}}$ is $S = e^{i_3 \boldsymbol{\zeta}/2}$, where $\boldsymbol{\zeta} = \zeta \hat{\boldsymbol{\zeta}}$. Such a rotation transforms $\bar R \to \bar S \bar R$, hence transforms the Pauli spinor (14.95) as $\phi \to \bar S \phi$. Rotating the Pauli spinor $\phi$ right-handedly by angle $\zeta$ about its spin axis leaves the spin axis unchanged, but multiplies the spinor by a phase $e^{-i\zeta/2}$,
$$ e^{-i_3 \boldsymbol{\zeta}/2} \, \phi = e^{-i\zeta/2} \, \phi . \tag{14.101} $$

Exercise 14.8 Translate a Pauli spinor into a quaternion. Given any Pauli spinor $\phi \equiv \begin{pmatrix} \phi_\uparrow \\ \phi_\downarrow \end{pmatrix}$, show that the corresponding scaled reverse rotor $\lambda \bar R$ in the Pauli representation (14.76) is the unitary $2 \times 2$ matrix
$$ \lambda \bar R = \begin{pmatrix} \phi_\uparrow & -\phi_\downarrow^* \\ \phi_\downarrow & \phi_\uparrow^* \end{pmatrix} . \tag{14.102} $$
Show that the corresponding real quaternion is
$$ \bar q = \lambda \bar R = \{ \mathrm{Re}\,\phi_\uparrow ,\ \mathrm{Im}\,\phi_\downarrow ,\ -\mathrm{Re}\,\phi_\downarrow ,\ \mathrm{Im}\,\phi_\uparrow \} . \tag{14.103} $$
⋄

Exercise 14.9 Translate a quaternion into a Pauli spinor. Show that if the rotor $R$ corresponds to a right-handed rotation by angle $\theta$ about unit axis $\mathbf{n} \equiv \{n_1, n_2, n_3\}$, then the corresponding scaled Pauli spinor $\phi \equiv \lambda \bar R \uparrow$ is, from (14.83),
$$ \phi \equiv \lambda \bar R \uparrow \; = \lambda \begin{pmatrix} \cos\frac{\theta}{2} - i n_3 \sin\frac{\theta}{2} \\ (n_2 - i n_1) \sin\frac{\theta}{2} \end{pmatrix} . \tag{14.104} $$
⋄
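The two translations in Exercises 14.8 and 14.9 can be checked against each other numerically. The sketch below builds the spinor (14.104) for a chosen amplitude, angle, and axis, converts it to a quaternion with (14.103), and compares the result with the components of $\lambda \bar R = \lambda(\cos\frac{\theta}{2} - \boldsymbol{\imath}\cdot\mathbf{n}\sin\frac{\theta}{2})$. The function names are hypothetical, and the check illustrates the formulae rather than deriving them.

```python
import math

def spinor_from_rotation(lam, theta, n):
    """Scaled Pauli spinor of equation (14.104) for angle theta about unit axis n."""
    n1, n2, n3 = n
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (lam * complex(c, -n3 * s), lam * complex(n2 * s, -n1 * s))

def quaternion_from_spinor(phi):
    """Real quaternion {w, x, y, z} of equation (14.103)."""
    up, down = phi
    return (up.real, down.imag, -down.real, up.imag)

lam, theta = 2.0, 1.1
n = (2 / 7, 3 / 7, 6 / 7)                  # unit axis: 4 + 9 + 36 = 49

phi = spinor_from_rotation(lam, theta, n)
q_bar = quaternion_from_spinor(phi)

# Components of lam * (cos(theta/2) - i.n sin(theta/2)), the scaled reverse rotor.
c, s = math.cos(theta / 2), math.sin(theta / 2)
expected = (lam * c, -lam * n[0] * s, -lam * n[1] * s, -lam * n[2] * s)

print(q_bar)
print(expected)
print(sum(x * x for x in q_bar))           # squared magnitude, equal to lam**2
```

The squared magnitude of the quaternion reproduces $\phi^\dagger\phi = \lambda^2$ of equation (14.99).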

14.15 Spacetime algebra

So far this chapter has concerned itself with ordinary n-dimensional Euclidean space, in which the length squared of a vector is the sum of the squares of its components. In special relativity, however, the scalar length $s$ of a spacetime interval $\{t, x, y, z\}$ is given by $s^2 = -t^2 + x^2 + y^2 + z^2$. Happily, all the results of previous sections hold with scarcely a change of stride.

Let $\boldsymbol{\gamma}_m$ ($m = 0, 1, 2, 3$) denote an orthonormal basis of spacetime, with $\boldsymbol{\gamma}_0$ representing the time axis, and $\boldsymbol{\gamma}_i$ ($i = 1, 2, 3$) the spatial axes. Geometric multiplication in the spacetime algebra is defined by
$$ \boldsymbol{\gamma}_m \boldsymbol{\gamma}_n = \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n + \boldsymbol{\gamma}_m \wedge \boldsymbol{\gamma}_n \tag{14.105} $$

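in the usual way. The key difference between the spacetime basis $\boldsymbol{\gamma}_m$ and Euclidean bases is that scalar products of the basis vectors $\boldsymbol{\gamma}_m$ form the Minkowski metric $\eta_{mn}$,
$$ \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n = \eta_{mn} \tag{14.106} $$
whereas scalar products of Euclidean basis elements $\boldsymbol{\gamma}_i$ formed the unit matrix, $\boldsymbol{\gamma}_i \cdot \boldsymbol{\gamma}_j = \delta_{ij}$, equation (14.6). In less abbreviated form, equations (14.105) and (14.106) state that the geometric product of each basis element with itself is
$$ -\boldsymbol{\gamma}_0^2 = \boldsymbol{\gamma}_1^2 = \boldsymbol{\gamma}_2^2 = \boldsymbol{\gamma}_3^2 = 1 , \tag{14.107} $$
while geometric products of different basis elements $\boldsymbol{\gamma}_m$ anticommute,
$$ \boldsymbol{\gamma}_m \boldsymbol{\gamma}_n = -\boldsymbol{\gamma}_n \boldsymbol{\gamma}_m = \boldsymbol{\gamma}_m \wedge \boldsymbol{\gamma}_n \quad (m \neq n) . \tag{14.108} $$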

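In the Dirac theory of relativistic spin-1/2 particles, the Dirac γ-matrices are required to satisfy
$$ \{ \boldsymbol{\gamma}_m , \boldsymbol{\gamma}_n \} = 2\,\eta_{mn} \tag{14.109} $$
where $\{\,\}$ denotes the anticommutator, $\{ \boldsymbol{\gamma}_m , \boldsymbol{\gamma}_n \} \equiv \boldsymbol{\gamma}_m \boldsymbol{\gamma}_n + \boldsymbol{\gamma}_n \boldsymbol{\gamma}_m$. The multiplication rules (14.109) for the Dirac γ-matrices are the same as those for geometric multiplication in the spacetime algebra, equations (14.107) and (14.108).

A 4-vector $\boldsymbol{a}$, a multivector of grade 1 in the geometric algebra of spacetime, is
$$ \boldsymbol{a} = \boldsymbol{\gamma}_m a^m = \boldsymbol{\gamma}_0 a^0 + \boldsymbol{\gamma}_1 a^1 + \boldsymbol{\gamma}_2 a^2 + \boldsymbol{\gamma}_3 a^3 . \tag{14.110} $$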

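Such a 4-vector $\boldsymbol{a}$ would be denoted $\not{a}$ in the Dirac slash notation. The product of two 4-vectors $\boldsymbol{a}$ and $\boldsymbol{b}$ is
$$ \boldsymbol{a}\boldsymbol{b} = \boldsymbol{a} \cdot \boldsymbol{b} + \boldsymbol{a} \wedge \boldsymbol{b} = a^m b^n \, \boldsymbol{\gamma}_m \cdot \boldsymbol{\gamma}_n + a^m b^n \, \boldsymbol{\gamma}_m \wedge \boldsymbol{\gamma}_n = a^m b^n \eta_{mn} + \tfrac{1}{2} a^m b^n [ \boldsymbol{\gamma}_m , \boldsymbol{\gamma}_n ] . \tag{14.111} $$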

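It is convenient to denote three of the six bivectors of the spacetime algebra by $\boldsymbol{\sigma}_i$,
$$ \boldsymbol{\sigma}_i \equiv \boldsymbol{\gamma}_0 \boldsymbol{\gamma}_i \quad (i = 1, 2, 3) . \tag{14.112} $$
The symbol $\boldsymbol{\sigma}_i$ is used because the algebra of bivectors $\boldsymbol{\sigma}_i$ is isomorphic to the algebra of Pauli matrices $\sigma_i$. The triple of bivectors $\boldsymbol{\sigma}_i$ will often be denoted shorthandedly by the symbol $\boldsymbol{\sigma}$,
$$ \boldsymbol{\sigma} \equiv \{ \boldsymbol{\sigma}_1 , \boldsymbol{\sigma}_2 , \boldsymbol{\sigma}_3 \} . \tag{14.113} $$
The pseudoscalar, the highest grade basis element of the spacetime algebra, is denoted $I$,
$$ \boldsymbol{\gamma}_0 \boldsymbol{\gamma}_1 \boldsymbol{\gamma}_2 \boldsymbol{\gamma}_3 = \boldsymbol{\sigma}_1 \boldsymbol{\sigma}_2 \boldsymbol{\sigma}_3 = I . \tag{14.114} $$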


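The pseudoscalar $I$ satisfies
$$ I^2 = -1 , \quad I \boldsymbol{\gamma}_m = -\boldsymbol{\gamma}_m I , \quad I \boldsymbol{\sigma}_i = \boldsymbol{\sigma}_i I . \tag{14.115} $$
The basis elements of the 4-dimensional spacetime algebra are then
$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{\boldsymbol{\gamma}_m}_{\text{4 vectors}} , \quad \underbrace{\boldsymbol{\sigma}_i ,\ I\boldsymbol{\sigma}_i}_{\text{6 bivectors}} , \quad \underbrace{I\boldsymbol{\gamma}_m}_{\text{4 pseudovectors}} , \quad \underbrace{I}_{\text{1 pseudoscalar}} \tag{14.116} $$
forming a linear space of dimension $1 + 4 + 6 + 4 + 1 = 16 = 2^4$. The reverse is defined in the usual way, equation (14.12), leaving unchanged multivectors of grade 0 or 1, modulo 4, and changing the sign of multivectors of grade 2 or 3, modulo 4:
$$ \bar{1} = 1 , \quad \bar{\boldsymbol{\gamma}}_m = \boldsymbol{\gamma}_m , \quad \bar{\boldsymbol{\sigma}}_i = -\boldsymbol{\sigma}_i , \quad \overline{I\boldsymbol{\sigma}_i} = -I\boldsymbol{\sigma}_i , \quad \overline{I\boldsymbol{\gamma}_m} = -I\boldsymbol{\gamma}_m , \quad \bar{I} = I . \tag{14.117} $$
The mapping
$$ \boldsymbol{\gamma}_i^{(3)} \leftrightarrow \boldsymbol{\sigma}_i \quad (i = 1, 2, 3) \tag{14.118} $$
(the superscript (3) distinguishes the 3D basis vectors from the 4D spacetime basis vectors) defines an isomorphism between the 8-dimensional geometric algebra (14.3) of 3 spatial dimensions and the 8-dimensional even spacetime subalgebra. Among other things, the isomorphism (14.118) implies the equivalence of the 3D spatial pseudoscalar $i_3$ and the 4D spacetime pseudoscalar $I$,
$$ i_3 \leftrightarrow I \tag{14.119} $$
since $i_3 = \boldsymbol{\gamma}_1 \boldsymbol{\gamma}_2 \boldsymbol{\gamma}_3$ and $I = \boldsymbol{\sigma}_1 \boldsymbol{\sigma}_2 \boldsymbol{\sigma}_3$.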

14.16 Complex quaternions

A complex quaternion (also called a biquaternion by W. R. Hamilton) is a quaternion
$$ q = a + \imath_i b_i = a + \boldsymbol{\imath} \cdot \mathbf{b} \tag{14.120} $$
in which the four coefficients $a$, $b_i$ ($i = 1, 2, 3$) are each complex numbers
$$ a = a_R + I a_I , \quad b_i = b_{i,R} + I b_{i,I} . \tag{14.121} $$
The imaginary $I$ is taken to commute with each of the quaternionic imaginaries $\imath_i$. The choice of symbol $I$ is deliberate: in the isomorphism (14.133) between the even spacetime algebra and complex quaternions, the commuting imaginary $I$ is isomorphic to the spacetime pseudoscalar $I$.

All of the equations in §14.10 on real quaternions remain valid without change, including the multiplication, conjugation, and inversion formulae (14.57)–(14.62). In the quaternionic conjugate $\bar q$ of a complex quaternion $q \equiv a + \boldsymbol{\imath} \cdot \mathbf{b}$,
$$ \bar q = a - \boldsymbol{\imath} \cdot \mathbf{b} , \tag{14.122} $$
the complex coefficients $a$ and $\mathbf{b}$ are not conjugated with respect to the complex imaginary $I$. The magnitude $|q|$ of a complex quaternion $q \equiv a + \boldsymbol{\imath} \cdot \mathbf{b}$,
$$ |q| = (\bar q q)^{1/2} = (q \bar q)^{1/2} = (a^2 + \mathbf{b} \cdot \mathbf{b})^{1/2} = (a^2 + b_1^2 + b_2^2 + b_3^2)^{1/2} , \tag{14.123} $$
is a complex number, not a real number. The complex conjugate $q^*$ of the complex quaternion is
$$ q^* = a^* + \boldsymbol{\imath} \cdot \mathbf{b}^* , \tag{14.124} $$

in which the complex coefficients $a$ and $\mathbf{b}$ are conjugated with respect to the imaginary $I$, but the quaternionic imaginaries $\boldsymbol{\imath}$ are not conjugated.

A non-zero complex quaternion can have zero magnitude (unlike a real quaternion), in which case it is null. The null condition $\bar q q = a^2 + b_1^2 + b_2^2 + b_3^2 = 0$ is a complex condition. The product of two null complex quaternions is a null quaternion. Under multiplication, null quaternions form a 6-dimensional subsemigroup (not a subgroup, because null quaternions do not have inverses) of the 8-dimensional semigroup of complex quaternions.

Exercise 14.10 Show that any non-trivial null complex quaternion $q$ can be written uniquely in the form
$$ q = (1 + I \boldsymbol{\imath} \cdot \mathbf{n})\, p , \tag{14.125} $$
where $p$ is a real quaternion, and $\mathbf{n}$ is a unit real 3-vector. Equivalently,
$$ q = p\, (1 + I \boldsymbol{\imath} \cdot \mathbf{n}') , \tag{14.126} $$
where $\mathbf{n}'$ is the unit real 3-vector
$$ \mathbf{n}' = \frac{\bar p\, \mathbf{n}\, p}{|p|^2} . \tag{14.127} $$
Solution. Write the null quaternion $q$ as
$$ q = p + I r \tag{14.128} $$
where $p$ and $r$ are real quaternions, both of which must be non-zero if $q$ is non-trivial. Then equation (14.125) is true with
$$ \boldsymbol{\imath} \cdot \mathbf{n} = \frac{r \bar p}{|p|^2} . \tag{14.129} $$
The null condition is $q \bar q = 0$. The vanishing of the real part, $\mathrm{Re}\,(q \bar q) = p \bar p - r \bar r = 0$, shows that $|p|^2 = |r|^2$. The vanishing of the imaginary ($I$) part, $\mathrm{Im}\,(q \bar q) = r \bar p + p \bar r = r \bar p + \overline{r \bar p} = 0$, shows that $r \bar p$ must be a pure quaternionic imaginary, since the quaternionic conjugate of $r \bar p$ is minus itself, so $r \bar p / |p|^2$ must be of the form $\boldsymbol{\imath} \cdot \mathbf{n}$. Its squared magnitude $\boldsymbol{\imath} \cdot \mathbf{n}\; \overline{\boldsymbol{\imath} \cdot \mathbf{n}} = r \bar p\, p \bar r / |p|^4 = |r|^2 / |p|^2 = 1$ is unity, so $\mathbf{n}$ must be a unit 3-vector. It follows immediately from the manner of construction that the expression (14.125) is unique, as long as $q$ is non-trivial. ⋄
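A null complex quaternion is easy to construct and test numerically from the factored form (14.125). The sketch below stores a complex quaternion as a tuple of four Python complex numbers, with Python's `1j` standing in for the commuting imaginary $I$. The function names are hypothetical. The product rule assumes $\imath_1\imath_2 = -\imath_3$ and cyclic permutations, which is what the isomorphism (14.132) appears to imply; the null test itself does not depend on that sign choice.

```python
def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 (cyclic)."""
    a, u = p[0], p[1:]
    b, v = q[0], q[1:]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    w = a * b - sum(ui * vi for ui, vi in zip(u, v))
    return (w,) + tuple(a * v[k] + b * u[k] - cross[k] for k in range(3))

def qconj(q):
    """Quaternionic conjugate (14.122): flip the vector part, leave I alone."""
    return (q[0], -q[1], -q[2], -q[3])

n = (2 / 7, 3 / 7, 6 / 7)                    # unit real 3-vector
p = (1.0 + 0j, 2.0 + 0j, -0.5 + 0j, 3.0 + 0j)  # arbitrary real quaternion

# q = (1 + I i.n) p, equation (14.125)
factor = (1 + 0j, 1j * n[0], 1j * n[1], 1j * n[2])
q = qmul(factor, p)

norm = qmul(qconj(q), q)                     # conj(q) q, a complex scalar
print(norm)                                  # all four components vanish to rounding error

# Split q = p + I r as in (14.128) and compare |p|^2 with |r|^2.
p_sq = sum(c.real ** 2 for c in q)
r_sq = sum(c.imag ** 2 for c in q)
print(p_sq, r_sq)
```

The equality of `p_sq` and `r_sq` is the real part of the null condition used in the solution above.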


14.17 Lorentz transformations and complex quaternions

Lorentz transformations are rotations of spacetime. Such rotations correspond, in the usual way, to even, unimodular elements of the geometric algebra of spacetime. The basis elements of the even spacetime algebra are
$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{\boldsymbol{\sigma}_i ,\ I\boldsymbol{\sigma}_i}_{\text{6 bivectors}} , \quad \underbrace{I}_{\text{1 pseudoscalar}} \tag{14.130} $$
forming a linear space of dimension $1 + 6 + 1 = 8$ over the real numbers. However, it is more elegant to treat the even spacetime algebra as a linear space of dimension $8 \div 2 = 4$ over complex scalars of the form $\lambda = \lambda_R + I \lambda_I$. The pseudoscalar $I$ qualifies as a scalar because it commutes with all elements of the even spacetime algebra, and it qualifies as an imaginary because $I^2 = -1$.

It is convenient to take the basis elements of the even spacetime algebra over the complex numbers to be
$$ \underbrace{1}_{\text{1 scalar}} , \quad \underbrace{I\boldsymbol{\sigma}_i}_{\text{3 bivectors}} \tag{14.131} $$

forming a linear space of dimension $1 + 3 = 4$. The reason for choosing $I\boldsymbol{\sigma}_i$ rather than $\boldsymbol{\sigma}_i$ as the elements of the basis (14.131) is that the basis $\{1, I\boldsymbol{\sigma}_i\}$ is equivalent to the basis (14.63) of the even algebra of 3-dimensional Euclidean space through the isomorphisms (14.118) and (14.119). This basis in turn is equivalent to the quaternionic basis $\{1, \imath_i\}$ through the isomorphism (14.68):
$$ I\boldsymbol{\sigma}_i \leftrightarrow i_3 \boldsymbol{\gamma}_i^{(3)} \leftrightarrow \imath_i \quad (i = 1, 2, 3) . \tag{14.132} $$
In other words, the even spacetime algebra is isomorphic to the algebra of quaternions with complex coefficients:
$$ a + I \boldsymbol{\sigma} \cdot \mathbf{b} \leftrightarrow a + \boldsymbol{\imath} \cdot \mathbf{b} \tag{14.133} $$

where $a = a_R + I a_I$ is a complex number, $\mathbf{b} = \mathbf{b}_R + I \mathbf{b}_I$ is a triple of complex numbers, $\boldsymbol{\sigma}$ is the triple of bivectors $\boldsymbol{\sigma}_i$, and $\boldsymbol{\imath}$ is the triple of quaternionic imaginaries. The isomorphism (14.133) between even elements of the spacetime algebra and complex quaternions implies that the group of Lorentz rotors, which are unimodular elements of the even spacetime algebra, is isomorphic to the group of unimodular complex quaternions,
$$ \text{spacetime rotors} \leftrightarrow \text{unit complex quaternions} . \tag{14.134} $$

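In §14.11 it was found that the group of 3D spatial rotors is isomorphic to the group of unimodular real quaternions. Thus Lorentz transformations are mathematically equivalent to complexified spatial rotations.

A Lorentz rotor can be written as a complex quaternion in what looks like the same form as the expression (14.70) for a 3D spatial rotor, with the difference that the rotation angle $\theta$ is complex, and the axis $\mathbf{n}$ of rotation is likewise complex. Thus
$$ R = e^{\boldsymbol{\theta}/2} = e^{\boldsymbol{\imath} \cdot \mathbf{n}\, \theta/2} = \cos\frac{\theta}{2} + \boldsymbol{\imath} \cdot \mathbf{n} \sin\frac{\theta}{2} \tag{14.135} $$
where $\boldsymbol{\theta}$ is the bivector complex quaternion
$$ \boldsymbol{\theta} \equiv \boldsymbol{\imath} \cdot \mathbf{n}\, \theta \equiv (\imath_1 n_1 + \imath_2 n_2 + \imath_3 n_3)\, \theta \tag{14.136} $$
whose complex magnitude is $|\boldsymbol{\theta}| \equiv (\bar{\boldsymbol{\theta}} \boldsymbol{\theta})^{1/2} = \theta$ and whose complex unit direction is $\hat{\boldsymbol{\theta}} \equiv \boldsymbol{\theta}/\theta \equiv \boldsymbol{\imath} \cdot \mathbf{n}$. The angle $\theta = \theta_R + I \theta_I$ is a complex angle, and $\mathbf{n} = \mathbf{n}_R + I \mathbf{n}_I$ is a complex-valued unit 3-vector, satisfying $\mathbf{n} \cdot \mathbf{n} = 1$. The condition $\mathbf{n} \cdot \mathbf{n} = 1$ of unit normalization is equivalent to the two conditions $\mathbf{n}_R \cdot \mathbf{n}_R - \mathbf{n}_I \cdot \mathbf{n}_I = 1$ and $2\, \mathbf{n}_R \cdot \mathbf{n}_I = 0$ on the real and imaginary parts of $\mathbf{n} \cdot \mathbf{n}$. The complex angle $\theta$ has 2 degrees of freedom, while the complex unit vector has 4 degrees of freedom, so the Lorentz rotor $R$ has 6 degrees of freedom, which is the correct number of degrees of freedom of the group of Lorentz transformations. The associated reverse rotor $\bar R$ is
$$ \bar R = e^{-\boldsymbol{\theta}/2} = e^{-\boldsymbol{\imath} \cdot \mathbf{n}\, \theta/2} = \cos\frac{\theta}{2} - \boldsymbol{\imath} \cdot \mathbf{n} \sin\frac{\theta}{2} , \tag{14.137} $$
the quaternionic conjugate of $R$. Note that $\theta$ and $\mathbf{n}$ in equation (14.137) are not conjugated with respect to the imaginary $I$. The sine and cosine of the complex angle $\theta$ appearing in equations (14.135) and (14.137) are related to its real and imaginary parts in the usual way,
$$ \cos\frac{\theta}{2} = \cos\frac{\theta_R}{2} \cosh\frac{\theta_I}{2} - I \sin\frac{\theta_R}{2} \sinh\frac{\theta_I}{2} , \quad \sin\frac{\theta}{2} = \sin\frac{\theta_R}{2} \cosh\frac{\theta_I}{2} + I \cos\frac{\theta_R}{2} \sinh\frac{\theta_I}{2} . \tag{14.138} $$

In the case of a pure spatial rotation, the angle $\theta = \theta_R$ and axis $\mathbf{n} = \mathbf{n}_R$ in the rotor (14.135) are both real. The rotor corresponding to a pure spatial rotation by angle $\theta_R$ right-handedly about unit real axis $\mathbf{n}_R$ is
$$ R = e^{\boldsymbol{\imath} \cdot \mathbf{n}_R\, \theta_R/2} = \cos\frac{\theta_R}{2} + \boldsymbol{\imath} \cdot \mathbf{n}_R \sin\frac{\theta_R}{2} . \tag{14.139} $$

A Lorentz boost is a change of velocity in some direction, without any spatial rotation, and represents a rotation of spacetime about some time-space plane. For example, a Lorentz boost along the 1-axis (the x-axis) is a rotation of spacetime in the 0–1 plane (the t–x plane). In the case of a pure Lorentz boost, the angle $\theta = I \theta_I$ is pure imaginary, but the axis $\mathbf{n} = \mathbf{n}_R$ remains pure real. The rotor corresponding to a boost by velocity $v = \tanh\theta_I$ in unit real direction $\mathbf{n}_R$ is
$$ R = e^{\boldsymbol{\imath} \cdot \mathbf{n}_R\, I \theta_I/2} = \cosh\frac{\theta_I}{2} + \boldsymbol{\imath} \cdot \mathbf{n}_R\, I \sinh\frac{\theta_I}{2} . \tag{14.140} $$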

Exercise 14.11 Factor a general Lorentz rotor $R = e^{\boldsymbol{\imath} \cdot \mathbf{n}\, \theta/2}$ into the product $UL$ of a pure spatial rotation $U$ followed by a pure Lorentz boost $L$. Do the two factors commute? ⋄

Exercise 14.12 Show that the geometry of the group of Lorentz rotors is the product of the geometries of the spatial rotation group and the boost group, which is a 3-sphere times Euclidean 3-space, $S^3 \times \mathbb{R}^3$. ⋄

14.18 Spatial Inversion (P) and Time Inversion (T)

Spatial inversion, or $P$ for parity, is the operation of reflecting all spatial coordinates while keeping the time coordinate unchanged. Spatial inversion may be accomplished by reflecting the spatial vector basis elements $\boldsymbol{\gamma}_i \to -\boldsymbol{\gamma}_i$, while keeping the time vector basis element $\boldsymbol{\gamma}_0$ unchanged. This results in $\boldsymbol{\sigma} \to -\boldsymbol{\sigma}$ and $I \to -I$. The equivalence $I\boldsymbol{\sigma} \leftrightarrow \boldsymbol{\imath}$ means that the quaternionic imaginary $\boldsymbol{\imath}$ is unchanged. Thus, if multivectors in the geometric spacetime algebra are written as linear combinations of products of $\boldsymbol{\gamma}_0$, $\boldsymbol{\imath}$, and $I$, then spatial inversion $P$ corresponds to the transformation
$$ P : \quad \boldsymbol{\gamma}_0 \to \boldsymbol{\gamma}_0 , \quad \boldsymbol{\imath} \to \boldsymbol{\imath} , \quad I \to -I . \tag{14.141} $$

In other words, spatial inversion may be accomplished by the rule: take the complex conjugate (with respect to $I$) of a multivector.

Time inversion, or $T$, is the operation of reversing time while keeping all spatial coordinates unchanged. Time inversion may be accomplished by reflecting the time vector basis element $\boldsymbol{\gamma}_0 \to -\boldsymbol{\gamma}_0$, while keeping the spatial vector basis elements $\boldsymbol{\gamma}_i$ unchanged. As with spatial inversion, this results in $\boldsymbol{\sigma} \to -\boldsymbol{\sigma}$ and $I \to -I$, which keeps $I\boldsymbol{\sigma}$ hence $\boldsymbol{\imath}$ unchanged. If multivectors in the geometric spacetime algebra are written as linear combinations of products of $\boldsymbol{\gamma}_0$, $\boldsymbol{\imath}$, and $I$, then time inversion $T$ corresponds to the transformation
$$ T : \quad \boldsymbol{\gamma}_0 \to -\boldsymbol{\gamma}_0 , \quad \boldsymbol{\imath} \to \boldsymbol{\imath} , \quad I \to -I . \tag{14.142} $$
For any multivector, time inversion corresponds to the instruction to flip $\boldsymbol{\gamma}_0$ and take the complex conjugate (with respect to $I$). The combined operation $PT$ of inverting both space and time corresponds to
$$ PT : \quad \boldsymbol{\gamma}_0 \to -\boldsymbol{\gamma}_0 , \quad \boldsymbol{\imath} \to \boldsymbol{\imath} , \quad I \to I . \tag{14.143} $$

For any multivector, spacetime inversion corresponds to the instruction to flip γ0 , while keeping ı and I unchanged.

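14.19 Electromagnetic field bivector

The electromagnetic field tensor $F^{mn}$ can be expressed as the bivector
$$ \boldsymbol{F} = \tfrac{1}{2} F^{mn} \, \boldsymbol{\gamma}_m \wedge \boldsymbol{\gamma}_n , \tag{14.144} $$
the factor of $\tfrac{1}{2}$ compensating for the double-counting over indices $m$ and $n$ (the $\tfrac{1}{2}$ could be omitted if the counting were over distinct bivector indices only). In terms of the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$, and the bivector basis elements $\boldsymbol{\sigma} \equiv \{\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2, \boldsymbol{\sigma}_3\}$ defined by equation (14.112), the electromagnetic field bivector $\boldsymbol{F}$ is
$$ \boldsymbol{F} = -\boldsymbol{\sigma} \cdot (\mathbf{E} + I \mathbf{B}) . \tag{14.145} $$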

14.20 How to implement Lorentz transformations on a computer

The advantages of quaternions for implementing spatial rotations are well-known to 3D game programmers. Compared to standard rotation matrices, quaternions offer increased speed and require less storage, and their algebraic properties simplify interpolation and splining.

Complex quaternions retain similar advantages for implementing Lorentz transformations. They are fast, compact, and straightforward to interpolate or spline (Exercises 14.13 and 14.14). Moreover, since complex quaternions contain real quaternions, Lorentz transformations can be implemented simply as an extension of spatial rotations in 3D programs that use quaternions to implement spatial rotations.

Lorentz rotors, 4-vectors, spacetime bivectors, and spinors (spin-1/2 objects) can all be implemented as complex quaternions. A complex quaternion
$$ q = w + \imath_1 x + \imath_2 y + \imath_3 z \tag{14.146} $$
with complex coefficients $w$, $x$, $y$, $z$ can be stored as the 8-component object
$$ q = \begin{pmatrix} w_R & x_R & y_R & z_R \\ w_I & x_I & y_I & z_I \end{pmatrix} . \tag{14.147} $$

Actually, OpenGL and other computer software store the scalar ($w$) component of a quaternion in the last (fourth) place, but here the scalar components are put in the zeroth position to conform to standard physics convention. The quaternion conjugate $\bar q$ of the quaternion (14.147) is
$$ \bar q = \begin{pmatrix} w_R & -x_R & -y_R & -z_R \\ w_I & -x_I & -y_I & -z_I \end{pmatrix} , \tag{14.148} $$
while its complex conjugate $q^*$ is
$$ q^* = \begin{pmatrix} w_R & x_R & y_R & z_R \\ -w_I & -x_I & -y_I & -z_I \end{pmatrix} . \tag{14.149} $$

A Lorentz rotor $R$ corresponds to a complex quaternion of unit modulus. The unimodular condition $\bar R R = 1$, a complex condition, removes 2 degrees of freedom from the 8 degrees of freedom of complex quaternions, leaving the Lorentz group with 6 degrees of freedom, which is as it should be.

Spatial rotations correspond to real unimodular quaternions, and account for 3 of the 6 degrees of freedom of Lorentz transformations. A spatial rotation by angle $\theta$ right-handedly about the 1-axis (the x-axis) is the real Lorentz rotor
$$ R = \cos(\theta/2) + \imath_1 \sin(\theta/2) , \tag{14.150} $$
or, stored as a complex quaternion,
$$ R = \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{14.151} $$

Lorentz boosts account for the remaining 3 of the 6 degrees of freedom of Lorentz transformations. A Lorentz boost by velocity $v$, or equivalently by boost angle $\theta = \mathrm{atanh}(v)$, along the 1-axis (the x-axis) is the complex Lorentz rotor
$$ R = \cosh(\theta/2) + I \imath_1 \sinh(\theta/2) , \tag{14.152} $$
or, stored as a complex quaternion,
$$ R = \begin{pmatrix} \cosh(\theta/2) & 0 & 0 & 0 \\ 0 & \sinh(\theta/2) & 0 & 0 \end{pmatrix} . \tag{14.153} $$

The rule for composing Lorentz transformations is simple: a Lorentz transformation $R$ followed by a Lorentz transformation $S$ is just the product $RS$ of the corresponding complex quaternions. The inverse of a Lorentz rotor $R$ is its quaternionic conjugate $\bar R$.

Any even multivector $q$ is equivalent to a complex quaternion by the isomorphism (14.133). According to the usual transformation law (14.35) for multivectors, the rule for Lorentz transforming an even multivector $q$ is
$$ R : \quad q \to \bar R\, q\, R \quad \text{(even multivector)} . \tag{14.154} $$

The transformation (14.154) says to multiply the three complex quaternions $\bar R$, $q$, and $R$, a one-line expression in a C++ program.

As an example of an even multivector, the electromagnetic field $\boldsymbol{F}$, equation (14.145), is a bivector in the spacetime algebra. In view of the isomorphism (14.133), the electromagnetic field bivector $\boldsymbol{F}$ can be written as the complex quaternion
$$ \boldsymbol{F} = \begin{pmatrix} 0 & -B_1 & -B_2 & -B_3 \\ 0 & E_1 & E_2 & E_3 \end{pmatrix} . \tag{14.155} $$
Under the parity transformation $P$ (14.141), the electric field $\mathbf{E}$ changes sign, whereas the magnetic field $\mathbf{B}$ does not, which is as it should be:
$$ P : \quad \mathbf{E} \to -\mathbf{E} , \quad \mathbf{B} \to \mathbf{B} . \tag{14.156} $$
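The storage (14.155) and the even-multivector rule (14.154) can be tried out in a few lines. The sketch below is written in Python rather than C++ for brevity, and is an illustration under stated assumptions, not a reference implementation: a complex quaternion is a tuple of four Python complex numbers with `1j` playing the role of $I$, the product assumes $\imath_1\imath_2 = -\imath_3$ and cyclic permutations (as the isomorphism (14.132) appears to imply), and all function names are hypothetical. It boosts a field along the 1-axis and checks the familiar facts that the field components along the boost are unchanged and that $E^2 - B^2$ and $\mathbf{E}\cdot\mathbf{B}$ are invariant.

```python
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 (cyclic)."""
    a, u = p[0], p[1:]
    b, v = q[0], q[1:]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    w = a * b - sum(ui * vi for ui, vi in zip(u, v))
    return (w,) + tuple(a * v[k] + b * u[k] - cross[k] for k in range(3))

def qconj(q):
    """Quaternionic conjugate (14.148)."""
    return (q[0], -q[1], -q[2], -q[3])

def field_quaternion(E, B):
    """Storage (14.155): real parts (0, -B), imaginary parts (0, E)."""
    return (0j,) + tuple(complex(-B[k], E[k]) for k in range(3))

def fields(F):
    """Recover (E, B) from a stored field quaternion."""
    E = tuple(c.imag for c in F[1:])
    B = tuple(-c.real for c in F[1:])
    return E, B

def transform_even(R, q):
    """Rule (14.154): q -> conj(R) q R."""
    return qmul(qmul(qconj(R), q), R)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

E, B = (1.0, 2.0, 3.0), (-0.5, 0.25, 4.0)
theta = math.atanh(0.6)                                   # boost angle for v = 0.6
R = (complex(math.cosh(theta / 2)), 1j * math.sinh(theta / 2), 0j, 0j)  # (14.153)

F_new = transform_even(R, field_quaternion(E, B))
E_new, B_new = fields(F_new)

print(E_new, B_new)
print(dot(E, E) - dot(B, B), dot(E_new, E_new) - dot(B_new, B_new))
print(dot(E, B), dot(E_new, B_new))
```

The two invariants are the real and imaginary parts of the complex scalar $(\mathbf{E} + I\mathbf{B})\cdot(\mathbf{E} + I\mathbf{B})$, which commutes with every rotor and so survives the transformation unchanged.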

According to the rule (14.154), the electromagnetic field bivector $\boldsymbol{F}$ Lorentz transforms as $\boldsymbol{F} \to \bar R \boldsymbol{F} R$, which is a powerful and elegant way to Lorentz transform the electromagnetic field.

A 4-vector $\boldsymbol{a} \equiv \boldsymbol{\gamma}_m a^m$ is a multivector of grade 1 in the spacetime algebra. A general odd multivector in the spacetime algebra is the sum of a vector (grade 1) part $\boldsymbol{a}$ and a pseudovector (grade 3) part $I\boldsymbol{b} = I \boldsymbol{\gamma}_m b^m$. The odd multivector can be written as the product of the time basis vector $\boldsymbol{\gamma}_0$ and an even multivector $q$,
$$ \boldsymbol{a} + I \boldsymbol{b} = \boldsymbol{\gamma}_0 q = \boldsymbol{\gamma}_0 \left( a^0 + I \imath_i a^i - I b^0 + \imath_i b^i \right) . \tag{14.157} $$
By the isomorphism (14.133), the even multivector $q$ is equivalent to the complex quaternion
$$ q = \begin{pmatrix} a^0 & b^1 & b^2 & b^3 \\ -b^0 & a^1 & a^2 & a^3 \end{pmatrix} . \tag{14.158} $$

According to the usual transformation law (14.35) for multivectors, the rule for Lorentz transforming the odd multivector $\boldsymbol{\gamma}_0 q$ is
$$ R : \quad \boldsymbol{\gamma}_0 q \to \bar R\, \boldsymbol{\gamma}_0 q\, R = \boldsymbol{\gamma}_0 \bar R^* q R . \tag{14.159} $$
In the last expression of (14.159), the factor $\boldsymbol{\gamma}_0$ has been brought to the left, to be consistent with the convention (14.157) that an odd multivector is $\boldsymbol{\gamma}_0$ on the left times an even multivector on the right. Notice that commuting $\boldsymbol{\gamma}_0$ through $\bar R$ converts the latter to its complex conjugate (with respect to $I$) $\bar R^*$, which is true because $\boldsymbol{\gamma}_0$ commutes with the quaternionic imaginary $\boldsymbol{\imath}$, but anticommutes with the pseudoscalar $I$. Thus if the components of an odd multivector are stored as a complex quaternion (14.158), then that complex quaternion $q$ Lorentz transforms as
$$ R : \quad q \to \bar R^* q R \quad \text{(odd multivector)} . \tag{14.160} $$

The rule (14.160) again says to multiply three complex quaternions $\bar R^*$, $q$, and $R$, a one-line expression in a C++ program. The transformation rule (14.160) for an odd multivector encoded as a complex quaternion differs from that (14.154) for an even multivector in that the first factor $\bar R$ is complex conjugated (with respect to $I$).

A vector $\boldsymbol{a}$ differs from a pseudovector $I\boldsymbol{b}$ in that the vector $\boldsymbol{a}$ changes sign under a parity transformation $P$ whereas the pseudovector $I\boldsymbol{b}$ does not. However, the behaviour of a pseudovector under a normal Lorentz transformation (which preserves parity) is identical to that of a vector. Thus in practical situations two 4-vectors $\boldsymbol{a}$ and $\boldsymbol{b}$ can be encoded into a single complex quaternion (14.158), and Lorentz transformed simultaneously, enabling two transformations to be done for the price of one.

Finally, a Dirac spinor is equivalent to a complex quaternion $q$ (§14.23). It Lorentz transforms as
$$ R : \quad q \to \bar R q \quad \text{(spinor)} . \tag{14.161} $$
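The 4-vector rule (14.160) can be sketched as follows. The text has C++ in mind; Python is used here only to keep the illustration short. As assumptions: a complex quaternion is a tuple of four Python complex numbers with `1j` standing in for $I$, the product uses $\imath_1\imath_2 = -\imath_3$ and cyclic permutations (inferred from the isomorphism (14.132)), and the function names are hypothetical. The example boosts a 4-vector along the 1-axis, compares the result with the explicit boost matrix (14.166), and then checks that a rotation followed by a boost preserves the interval $-t^2 + x^2 + y^2 + z^2$.

```python
import math

def qmul(p, q):
    """Product of complex quaternions (w, x, y, z), assuming i1*i2 = -i3 (cyclic)."""
    a, u = p[0], p[1:]
    b, v = q[0], q[1:]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    w = a * b - sum(ui * vi for ui, vi in zip(u, v))
    return (w,) + tuple(a * v[k] + b * u[k] - cross[k] for k in range(3))

def qconj(q):
    """Quaternionic conjugate (14.148): flip the sign of the vector part."""
    return (q[0], -q[1], -q[2], -q[3])

def cconj(q):
    """Complex conjugate (14.149): conjugate each coefficient with respect to I."""
    return tuple(complex(c).conjugate() for c in q)

def boost_x(v):
    """Boost rotor (14.153) for velocity v along the 1-axis."""
    theta = math.atanh(v)
    return (complex(math.cosh(theta / 2)), 1j * math.sinh(theta / 2), 0j, 0j)

def rotation_z(angle):
    """Real rotor for a spatial rotation about the 3-axis, as in (14.139)."""
    return (complex(math.cos(angle / 2)), 0j, 0j, complex(math.sin(angle / 2)))

def transform_vector(R, a):
    """Rule (14.160), q -> conj(R)* q R, applied to a = (a0, a1, a2, a3)."""
    q = (complex(a[0]), 1j * a[1], 1j * a[2], 1j * a[3])  # storage (14.158), b = 0
    out = qmul(qmul(cconj(qconj(R)), q), R)
    return (out[0].real, out[1].imag, out[2].imag, out[3].imag)

def interval(a):
    return -a[0] ** 2 + a[1] ** 2 + a[2] ** 2 + a[3] ** 2

v = 0.6
gamma = 1 / math.sqrt(1 - v * v)
a = (1.0, 2.0, 3.0, 4.0)

boosted = transform_vector(boost_x(v), a)
expected = (gamma * a[0] + gamma * v * a[1],              # row vector times (14.166)
            gamma * v * a[0] + gamma * a[1], a[2], a[3])

combined = qmul(rotation_z(0.7), boost_x(v))              # rotation followed by boost
mixed = transform_vector(combined, a)

print(boosted, expected)
print(interval(a), interval(mixed))
```

Composing transformations costs one quaternion product, and the 4-vector itself is never multiplied by a 4 × 4 matrix, which is the point Exercise 14.15 is designed to drive home.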

Exercise 14.13 Interpolate a Lorentz transformation. Argue that the interpolating Lorentz rotor $R(x)$ that corresponds to uniform rotation and acceleration between initial and final Lorentz rotors $R_0$ and $R_1$ as the parameter $x$ varies uniformly from 0 to 1 is
$$ R(x) = R_0 \exp\left[ x \ln(R_1 / R_0) \right] . \tag{14.162} $$
What are the exponential and logarithm of a complex quaternion in terms of its components? Address the issue of the multi-valued character of the logarithm. ⋄

Exercise 14.14 Spline a Lorentz transformation. A spline is a polynomial that interpolates between two points with given values and derivatives at the two points. Confirm that the cubic spline of a real function $f(x)$ with given initial and final values $f_0$ and $f_1$ and given initial and final derivatives $f_0'$ and $f_1'$ at $x = 0$ and $x = 1$ is
$$ f(x) = f_0 + f_0' x + \left[ 3(f_1 - f_0) - 2 f_0' - f_1' \right] x^2 + \left[ 2(f_0 - f_1) + f_0' + f_1' \right] x^3 . \tag{14.163} $$

The case in which the derivatives at the endpoints are set to zero, $f_0' = f_1' = 0$, is called the “natural” spline. Argue that a Lorentz rotor can be splined by splining the quaternionic components of the logarithm of the Lorentz rotor. ⋄

Exercise 14.15 The wrong way to implement a Lorentz transformation. The purpose of this exercise is to persuade you that Lorentz transforming a 4-vector by the rule (14.160) is a much better idea than Lorentz transforming by multiplying by an explicit $4 \times 4$ matrix. Suppose that the Lorentz rotor $R$ is the complex quaternion
$$ R = \begin{pmatrix} w_R & x_R & y_R & z_R \\ w_I & x_I & y_I & z_I \end{pmatrix} . \tag{14.164} $$
Show that the Lorentz transformation (14.160) transforms the 4-vector components $a^m = \{a^0, a^1, a^2, a^3\}$ as follows (note that the $4 \times 4$ rotation matrix is written to the right of the 4-vector in accordance with the computer graphics convention that rotations accumulate to the right, opposite to the physics convention; to recover the physics convention, take the transpose):
$$ R : \quad \begin{pmatrix} a^0 & a^1 & a^2 & a^3 \end{pmatrix} \to \begin{pmatrix} a^0 & a^1 & a^2 & a^3 \end{pmatrix} \begin{pmatrix} |w|^2 + |x|^2 + |y|^2 + |z|^2 & 2 (w_R x_I - w_I x_R + y_R z_I - y_I z_R) & 2 (w_R y_I - w_I y_R + z_R x_I - z_I x_R) & 2 (w_R z_I - w_I z_R + x_R y_I - x_I y_R) \\ 2 (w_R x_I - w_I x_R - y_R z_I + y_I z_R) & |w|^2 + |x|^2 - |y|^2 - |z|^2 & 2 (x_R y_R + x_I y_I + w_R z_R + w_I z_I) & 2 (z_R x_R + z_I x_I - w_R y_R - w_I y_I) \\ 2 (w_R y_I - w_I y_R - z_R x_I + z_I x_R) & 2 (x_R y_R + x_I y_I - w_R z_R - w_I z_I) & |w|^2 - |x|^2 + |y|^2 - |z|^2 & 2 (y_R z_R + y_I z_I + w_R x_R + w_I x_I) \\ 2 (w_R z_I - w_I z_R - x_R y_I + x_I y_R) & 2 (z_R x_R + z_I x_I + w_R y_R + w_I y_I) & 2 (y_R z_R + y_I z_I - w_R x_R - w_I x_I) & |w|^2 - |x|^2 - |y|^2 + |z|^2 \end{pmatrix} , \tag{14.165} $$
where $|\ |$ signifies the absolute value of a complex number, as in $|w|^2 = w_R^2 + w_I^2$. As a simple example, show that the transformation (14.165) in the case of a Lorentz boost by velocity $v$ along the 1-axis, where the rotor $R$ takes the form (14.153), is
$$ R : \quad \begin{pmatrix} a^0 & a^1 & a^2 & a^3 \end{pmatrix} \to \begin{pmatrix} a^0 & a^1 & a^2 & a^3 \end{pmatrix} \begin{pmatrix} \gamma & \gamma v & 0 & 0 \\ \gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{14.166} $$
with $\gamma$ the familiar Lorentz gamma factor
$$ \gamma = \cosh\theta = \frac{1}{(1 - v^2)^{1/2}} , \quad \gamma v = \sinh\theta = \frac{v}{(1 - v^2)^{1/2}} . \tag{14.167} $$
⋄

14.21 Dirac matrices

The multiplication rules (14.105) for the basis vectors $\boldsymbol{\gamma}_m$ of the spacetime algebra are identical to the rules (14.109) governing the Clifford algebra of the Dirac γ-matrices used in the Dirac theory of relativistic spin-1/2 particles.

The Dirac γ-matrices are conventionally represented by $4 \times 4$ complex matrices. To ensure consistency between the relativistic and quantum mechanical ways of taking the scalar (inner) product, it is desirable to require that taking the Hermitian conjugate of any of the basis vectors $\boldsymbol{\gamma}_m$ be equivalent to raising its index,
$$ \boldsymbol{\gamma}_m^\dagger = \boldsymbol{\gamma}^m . \tag{14.168} $$

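Given the requirement (14.168), the matrices representing the basis vectors $\boldsymbol{\gamma}_m$ must be traceless (because a trace is a scalar, and the basis vectors cannot contain any scalar part), Hermitian or anti-Hermitian as the self-product of the matrix is $\pm 1$ (so that $\boldsymbol{\gamma}_m^\dagger = \boldsymbol{\gamma}^m = \eta^{mn} \boldsymbol{\gamma}_n$), and unitary and anticommuting (so that $\boldsymbol{\gamma}_m^\dagger \cdot \boldsymbol{\gamma}_n = \boldsymbol{\gamma}^m \cdot \boldsymbol{\gamma}_n = \delta^m_n$). The precise choice of matrices is not fundamental: any set of 4 matrices satisfying these conditions will do.

The high-energy physics community conventionally adopts the $+{-}{-}{-}$ metric signature, which is opposite to the convention adopted here. With the high-energy $+{-}{-}{-}$ signature, the standard convention for the Dirac γ-matrices is
$$ \boldsymbol{\gamma}_0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \quad \boldsymbol{\gamma}_i = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix} , \tag{14.169} $$
where 1 denotes the unit $2 \times 2$ matrix, and $\sigma_i$ denote the three $2 \times 2$ Pauli matrices (14.76). The choice of $\boldsymbol{\gamma}_0$ as a diagonal matrix is motivated by Dirac’s discovery that eigenvectors of the time basis vector $\boldsymbol{\gamma}_0$ with eigenvalues of opposite sign define particles and antiparticles in their rest frames (see §14.22). To convert to the $-{+}{+}{+}$ metric signature adopted here while retaining the conventional set of eigenvectors, an additional factor of $i$ must be inserted into the γ-matrices:
$$ \boldsymbol{\gamma}_0 = i \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \quad \boldsymbol{\gamma}_i = i \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix} . \tag{14.170} $$
In the representation of equations (14.169) or (14.170), the bivectors $\boldsymbol{\sigma}_i$ and $I\boldsymbol{\sigma}_i$ and the pseudoscalar $I$ of the spacetime algebra are
$$ \boldsymbol{\sigma}_i = \begin{pmatrix} 0 & -\sigma_i \\ -\sigma_i & 0 \end{pmatrix} , \quad I\boldsymbol{\sigma}_i = i \begin{pmatrix} \sigma_i & 0 \\ 0 & \sigma_i \end{pmatrix} , \quad I = -i \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \tag{14.171} $$
whose representation as matrices is the same for either signature $-{+}{+}{+}$ or $+{-}{-}{-}$. The Hermitian conjugates of the bivector and pseudoscalar basis elements are
$$ \boldsymbol{\sigma}_i^\dagger = \boldsymbol{\sigma}_i , \quad (I\boldsymbol{\sigma}_i)^\dagger = -I\boldsymbol{\sigma}_i , \quad I^\dagger = -I . \tag{14.172} $$
The conventional chiral matrix $\gamma_5$ of Dirac theory is defined by
$$ \gamma_5 \equiv i \boldsymbol{\gamma}_0 \boldsymbol{\gamma}_1 \boldsymbol{\gamma}_2 \boldsymbol{\gamma}_3 = i I = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \tag{14.173} $$
whose representation is again the same for either signature $-{+}{+}{+}$ or $+{-}{-}{-}$. The chiral matrix is Hermitian,
$$ \gamma_5^\dagger = \gamma_5 . \tag{14.174} $$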
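The representation (14.170) can be verified against the anticommutation rule (14.109) by direct matrix multiplication. The sketch below uses plain nested lists so that it needs no external library. It assumes the standard form of the Pauli matrices for (14.76), and the helper names are hypothetical. It checks $\{\boldsymbol{\gamma}_m, \boldsymbol{\gamma}_n\} = 2\eta_{mn}$ for the $-{+}{+}{+}$ signature and compares $i\boldsymbol{\gamma}_0\boldsymbol{\gamma}_1\boldsymbol{\gamma}_2\boldsymbol{\gamma}_3$ with the chiral matrix (14.173).

```python
def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(s, M):
    return [[s * x for x in row] for row in M]

one = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
# Standard Pauli matrices, assumed to be those of equations (14.76).
sigma = [[[0, 1], [1, 0]],
         [[0, -1j], [1j, 0]],
         [[1, 0], [0, -1]]]

# Equations (14.170): gamma matrices for the -+++ signature.
gamma = [scale(1j, block(one, zero, zero, scale(-1, one)))]
gamma += [scale(1j, block(zero, s, scale(-1, s), zero)) for s in sigma]

eta = [-1, 1, 1, 1]                         # diagonal of the Minkowski metric

def max_anticommutator_error():
    """Largest deviation of {gamma_m, gamma_n} from 2 eta_mn times the unit matrix."""
    worst = 0.0
    for m in range(4):
        for n in range(4):
            ab, ba = matmul(gamma[m], gamma[n]), matmul(gamma[n], gamma[m])
            for i in range(4):
                for j in range(4):
                    target = 2 * eta[m] if (m == n and i == j) else 0
                    worst = max(worst, abs(ab[i][j] + ba[i][j] - target))
    return worst

pseudoscalar = matmul(matmul(gamma[0], gamma[1]), matmul(gamma[2], gamma[3]))
gamma5 = scale(1j, pseudoscalar)
chiral = block(zero, one, one, zero)        # right-hand side of (14.173)

print(max_anticommutator_error())
print(gamma5 == chiral)
```

Because every entry is a small Gaussian integer, the arithmetic is exact and the comparison with the chiral matrix can be made without a tolerance.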


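14.22 Dirac spinors

In the Dirac theory of relativistic spin-1/2 particles, a Dirac spinor $\phi$ is represented as a 2-component column vector of Pauli spinors $\phi_\Uparrow$ and $\phi_\Downarrow$, comprising 4 complex (with respect to $i$) components and hence 8 degrees of freedom,
$$ \phi = \begin{pmatrix} \phi_\Uparrow \\ \phi_\Downarrow \end{pmatrix} = \begin{pmatrix} \phi_{\Uparrow\uparrow} \\ \phi_{\Uparrow\downarrow} \\ \phi_{\Downarrow\uparrow} \\ \phi_{\Downarrow\downarrow} \end{pmatrix} . \tag{14.175} $$
The Dirac γ-matrices operate by pre-multiplication on Dirac spinors $\phi$, yielding other Dirac spinors. In the Dirac representation (14.170), the four unit Dirac spinors
$$ \Uparrow\uparrow \; = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad \Uparrow\downarrow \; = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad \Downarrow\uparrow \; = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad \Downarrow\downarrow \; = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} , \tag{14.176} $$
are eigenvectors of the time basis vector $\boldsymbol{\gamma}_0$ and of the bivector $I\boldsymbol{\sigma}_3$, with $\Uparrow$ and $\Downarrow$ denoting eigenvectors of $\boldsymbol{\gamma}_0$, and $\uparrow$ and $\downarrow$ eigenvectors of $I\boldsymbol{\sigma}_3$,
$$ \boldsymbol{\gamma}_0 \Uparrow \; = i \Uparrow , \quad \boldsymbol{\gamma}_0 \Downarrow \; = -i \Downarrow , \quad I\boldsymbol{\sigma}_3 \uparrow \; = i \uparrow , \quad I\boldsymbol{\sigma}_3 \downarrow \; = -i \downarrow . \tag{14.177} $$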

The bivector $I\boldsymbol{\sigma}_3$ is the generator of a spatial rotation about the 3-axis (z-axis), equation (14.132). The four eigenvectors (14.177) form an orthonormal basis.

A pure spin-up state $\uparrow$ can be rotated into a pure spin-down state $\downarrow$, or vice versa, by a spatial rotation about the 1-axis or 2-axis. By contrast, a pure time-up state $\Uparrow$ cannot be rotated into a pure time-down state $\Downarrow$, or vice versa, by any Lorentz transformation. Consider for example trying to rotate the pure time-up spin-up $\Uparrow\uparrow$ state into any combination of pure time-down $\Downarrow$ states. According to the expression (14.192), the Dirac spinor $\phi$ obtained by Lorentz transforming the $\Uparrow\uparrow$ state is pure $\Downarrow$ only if the corresponding complex quaternion $\bar q$ is pure imaginary. But a pure imaginary quaternion has negative squared magnitude $\bar q q$, so cannot be equivalent to any rotor of unit magnitude. Thus the pure time-up and pure time-down states $\Uparrow$ and $\Downarrow$ are distinct states that cannot be transformed into each other by any Lorentz transformation. The two states represent distinct species, particles and antiparticles.

Although a pure time-up state cannot be transformed into a pure time-down state or vice versa by any Lorentz transformation, the time-up and time-down eigenstates $\Uparrow$ and $\Downarrow$ do mix under Lorentz transformations. The manner in which Dirac spinors transform is described in §14.23.

The choice of time-axis $\boldsymbol{\gamma}_0$ and spin-axis $\boldsymbol{\gamma}_3$ with respect to which the eigenvectors are defined can of course be adjusted arbitrarily by a Lorentz boost and a spatial rotation. The eigenvectors of a particular time-axis $\boldsymbol{\gamma}_0$ correspond to particles and antiparticles that are at rest in that frame. The eigenvectors associated with a particular spin-axis $\boldsymbol{\gamma}_3$ correspond to particles or antiparticles that are pure spin-up or pure spin-down in that frame.




The geometric algebra

14.23 Dirac spinors as complex quaternions

In §14.14 it was found that a spin-½ object in 3D space, a Pauli spinor, is isomorphic to a scaled 3D reverse rotor, or real quaternion. In the relativistic theory, the corresponding spin-½ object, a Dirac spinor ϕ, is isomorphic (14.181) to a complex quaternion. The 4 complex degrees of freedom of the Dirac spinor ϕ are equivalent to the 8 degrees of freedom of a complex quaternion.

A physically interesting complication arises in the relativistic case because a non-trivial Dirac spinor can be null, with zero magnitude, whereas any non-trivial Pauli spinor is necessarily non-null. The cases of non-null (massive) and null (massless) Dirac spinors are considered respectively in §14.24 and §14.25. The present section establishes an isomorphism (14.181) between Dirac spinors and complex quaternions that is valid in general, regardless of whether the Dirac spinor is null or not.

If a is a spacetime multivector, equivalent to an element of the Clifford algebra of Dirac γ-matrices, then under rotation by Lorentz rotor R, the multivector a operating on the Dirac spinor ϕ transforms as

\[
R : a\varphi \to (\bar{R} a R)(\bar{R}\varphi) = \bar{R} a \varphi .
\tag{14.178}
\]

This shows that a Dirac spinor ϕ Lorentz transforms, by construction, as

\[
R : \varphi \to \bar{R}\varphi .
\tag{14.179}
\]

The rule (14.179) is precisely the transformation rule for reverse spacetime rotors under Lorentz transformations: under a rotation by rotor R, a reverse rotor S̄ transforms as S̄ → R̄S̄. More generally, the transformation law (14.179) holds for any linear combination of Dirac spinors ϕ. The isomorphism (14.134) between spacetime rotors and unit quaternions shows that unit Dirac spinors are isomorphic to unit (reverse) complex quaternions. The algebra of linear combinations of unit complex quaternions is just the algebra of complex quaternions. Thus the algebra of Dirac spinors is isomorphic to the algebra of (reverse) complex quaternions. Specifically, any Dirac spinor ϕ can be expressed uniquely in the form of a 4 × 4 matrix $\bar{\mathbf{q}}$, the Dirac representation of a reverse complex quaternion q̄, acting on the time-up spin-up eigenvector ⇑↑ (the precise translation between Dirac spinors and complex quaternions is left as Exercises 14.16 and 14.17):

\[
\varphi = \bar{\mathbf{q}} \, \Uparrow\uparrow .
\tag{14.180}
\]

In this section (including the Exercises) the 4 × 4 matrix $\bar{\mathbf{q}}$ is written in boldface to distinguish it from the quaternion q̄ that it represents; but the distinction is not fundamental, so the temporary boldface notation is dropped in subsequent sections. The equivalence (14.180) establishes that Dirac spinors are isomorphic to reverse complex quaternions

\[
\varphi \leftrightarrow \bar{q} .
\tag{14.181}
\]

The isomorphism means that there is a one-to-one correspondence between Dirac spinors ϕ and reverse complex quaternions q̄, and that they transform in the same way under Lorentz transformations.

Notwithstanding the isomorphism (14.181), Dirac spinors differ from complex quaternions in that they have an additional structure that is essential to quantum mechanics, an inner product ϕ1†ϕ2 of two Dirac spinors ϕ1 and ϕ2. The inner product ϕ1†ϕ2 is a complex (with respect to i) number. The Hermitian conjugate ϕ† of a Dirac spinor ϕ (14.180) is defined to be

\[
\varphi^\dagger = (\Uparrow\uparrow)^\dagger \, \bar{\mathbf{q}}^\dagger ,
\tag{14.182}
\]

where (⇑↑)† = ( 1 0 0 0 ) is the Hermitian conjugate of the time-up spin-up eigenvector ⇑↑, and $\bar{\mathbf{q}}^\dagger$ is the Hermitian conjugate of the matrix $\bar{\mathbf{q}}$. A related spinor is the reverse, or adjoint, spinor ϕ̄, defined to be

\[
\bar\varphi = (\Uparrow\uparrow)^\dagger \, \mathbf{q} ,
\tag{14.183}
\]

where $\mathbf{q}$ is the reverse of the matrix $\bar{\mathbf{q}}$. The Hermitian conjugate spinor ϕ† is related to the adjoint spinor ϕ̄ by (Exercise 14.18)

\[
\varphi^\dagger = i \bar\varphi \gamma_0 .
\tag{14.184}
\]

The product ϕ̄ϕ of the adjoint spinor ϕ̄ with ϕ is a Lorentz-invariant scalar, as follows from the Lorentz invariance of q q̄. On the other hand, the product ϕ†ϕ of the Hermitian conjugate spinor ϕ† with ϕ is the time component of a 4-vector

\[
i \bar\varphi \gamma_m \varphi .
\tag{14.185}
\]

Exercise 14.16

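Translate a Dirac spinor into a complex quaternion. Given any Dirac spinor

\[
\varphi = \begin{pmatrix} \varphi_{\Uparrow\uparrow} \\ \varphi_{\Uparrow\downarrow} \\ \varphi_{\Downarrow\uparrow} \\ \varphi_{\Downarrow\downarrow} \end{pmatrix} ,
\tag{14.186}
\]

show that the corresponding reverse complex quaternion q̄, and the equivalent 4 × 4 matrix $\bar{\mathbf{q}}$ in the Dirac representation (14.170), such that $\varphi = \bar{\mathbf{q}} \Uparrow\uparrow$, are (the complex conjugates ϕ∗a of the components ϕa of the spinor are with respect to the quantum mechanical imaginary i)

\[
\bar{q} =
\begin{Bmatrix}
\mathrm{Re}\,\varphi_{\Uparrow\uparrow} & \mathrm{Im}\,\varphi_{\Uparrow\downarrow} & -\mathrm{Re}\,\varphi_{\Uparrow\downarrow} & \mathrm{Im}\,\varphi_{\Uparrow\uparrow} \\
-\mathrm{Im}\,\varphi_{\Downarrow\uparrow} & \mathrm{Re}\,\varphi_{\Downarrow\downarrow} & \mathrm{Im}\,\varphi_{\Downarrow\downarrow} & \mathrm{Re}\,\varphi_{\Downarrow\uparrow}
\end{Bmatrix}
\leftrightarrow
\bar{\mathbf{q}} =
\begin{pmatrix}
\varphi_{\Uparrow\uparrow} & -\varphi^*_{\Uparrow\downarrow} & \varphi_{\Downarrow\uparrow} & \varphi^*_{\Downarrow\downarrow} \\
\varphi_{\Uparrow\downarrow} & \varphi^*_{\Uparrow\uparrow} & \varphi_{\Downarrow\downarrow} & -\varphi^*_{\Downarrow\uparrow} \\
\varphi_{\Downarrow\uparrow} & \varphi^*_{\Downarrow\downarrow} & \varphi_{\Uparrow\uparrow} & -\varphi^*_{\Uparrow\downarrow} \\
\varphi_{\Downarrow\downarrow} & -\varphi^*_{\Downarrow\uparrow} & \varphi_{\Uparrow\downarrow} & \varphi^*_{\Uparrow\uparrow}
\end{pmatrix} .
\tag{14.187}
\]

Show that the complex quaternion q (the reverse of q̄), and the equivalent 4 × 4 matrix $\mathbf{q}$ (the reverse of $\bar{\mathbf{q}}$) in the Dirac representation (14.170), are

\[
q =
\begin{Bmatrix}
\mathrm{Re}\,\varphi_{\Uparrow\uparrow} & -\mathrm{Im}\,\varphi_{\Uparrow\downarrow} & \mathrm{Re}\,\varphi_{\Uparrow\downarrow} & -\mathrm{Im}\,\varphi_{\Uparrow\uparrow} \\
-\mathrm{Im}\,\varphi_{\Downarrow\uparrow} & -\mathrm{Re}\,\varphi_{\Downarrow\downarrow} & -\mathrm{Im}\,\varphi_{\Downarrow\downarrow} & -\mathrm{Re}\,\varphi_{\Downarrow\uparrow}
\end{Bmatrix}
\leftrightarrow
\mathbf{q} =
\begin{pmatrix}
\varphi^*_{\Uparrow\uparrow} & \varphi^*_{\Uparrow\downarrow} & -\varphi^*_{\Downarrow\uparrow} & -\varphi^*_{\Downarrow\downarrow} \\
-\varphi_{\Uparrow\downarrow} & \varphi_{\Uparrow\uparrow} & -\varphi_{\Downarrow\downarrow} & \varphi_{\Downarrow\uparrow} \\
-\varphi^*_{\Downarrow\uparrow} & -\varphi^*_{\Downarrow\downarrow} & \varphi^*_{\Uparrow\uparrow} & \varphi^*_{\Uparrow\downarrow} \\
-\varphi_{\Downarrow\downarrow} & \varphi_{\Downarrow\uparrow} & -\varphi_{\Uparrow\downarrow} & \varphi_{\Uparrow\uparrow}
\end{pmatrix} .
\tag{14.188}
\]

Conclude that the reverse spinor $\bar\varphi \equiv (\Uparrow\uparrow)^\dagger \mathbf{q}$ is

\[
\bar\varphi \equiv (\Uparrow\uparrow)^\dagger \mathbf{q} =
\begin{pmatrix} \varphi^*_{\Uparrow\uparrow} & \varphi^*_{\Uparrow\downarrow} & -\varphi^*_{\Downarrow\uparrow} & -\varphi^*_{\Downarrow\downarrow} \end{pmatrix} .
\tag{14.189}
\]

⋄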

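Exercise 14.17 Translate a complex quaternion into a Dirac spinor. Show that the complex quaternion q ≡ w + ıx + ȷy + kz is equivalent in the Dirac representation (14.170) to the 4 × 4 matrix $\mathbf{q}$

\[
q =
\begin{Bmatrix}
w_R & x_R & y_R & z_R \\
w_I & x_I & y_I & z_I
\end{Bmatrix}
\leftrightarrow
\mathbf{q} =
\begin{pmatrix}
w_R + i z_R & i x_R + y_R & -i w_I + z_I & x_I - i y_I \\
i x_R - y_R & w_R - i z_R & x_I + i y_I & -i w_I - z_I \\
-i w_I + z_I & x_I - i y_I & w_R + i z_R & i x_R + y_R \\
x_I + i y_I & -i w_I - z_I & i x_R - y_R & w_R - i z_R
\end{pmatrix} .
\tag{14.190}
\]

Show that the reverse quaternion q̄, the complex conjugate (with respect to I) quaternion q∗, and the reverse complex conjugate (with respect to I) quaternion q̄∗ are respectively equivalent to the 4 × 4 matrices

\[
\bar{q} \leftrightarrow \bar{\mathbf{q}} \equiv -\gamma_0 \, \mathbf{q}^\dagger \gamma_0 ,
\tag{14.191a}
\]
\[
q^* \leftrightarrow \mathbf{q}^* = -\gamma_0 \, \mathbf{q} \, \gamma_0 ,
\tag{14.191b}
\]
\[
\bar{q}^* \leftrightarrow \bar{\mathbf{q}}^* = -\gamma_0 \, \bar{\mathbf{q}} \, \gamma_0 .
\tag{14.191c}
\]

Conclude that the Dirac spinor $\varphi \equiv \bar{\mathbf{q}} \Uparrow\uparrow$ corresponding to the reverse complex quaternion q̄ is

\[
\varphi \equiv \bar{\mathbf{q}} \, \Uparrow\uparrow =
\begin{pmatrix} w_R - i z_R \\ -i x_R + y_R \\ -i w_I - z_I \\ -x_I - i y_I \end{pmatrix} ,
\tag{14.192}
\]

that the reverse spinor $\bar\varphi \equiv (\Uparrow\uparrow)^\dagger \mathbf{q}$ is

\[
\bar\varphi \equiv (\Uparrow\uparrow)^\dagger \mathbf{q} =
\begin{pmatrix} w_R + i z_R & i x_R + y_R & -i w_I + z_I & x_I - i y_I \end{pmatrix} ,
\tag{14.193}
\]

and that the Hermitian conjugate spinor $\varphi^\dagger \equiv (\Uparrow\uparrow)^\dagger \bar{\mathbf{q}}^\dagger$ is

\[
\varphi^\dagger \equiv (\Uparrow\uparrow)^\dagger \bar{\mathbf{q}}^\dagger =
\begin{pmatrix} w_R + i z_R & i x_R + y_R & i w_I - z_I & -x_I + i y_I \end{pmatrix} .
\tag{14.194}
\]

Hence conclude that ϕ̄ϕ and ϕ†ϕ are respectively the real part of, and the absolute value of, the complex magnitude squared q̄q ≡ λ² of the complex quaternion q,

\[
\bar\varphi \varphi = \lambda_R^2 - \lambda_I^2 ,
\tag{14.195a}
\]
\[
\varphi^\dagger \varphi = \lambda_R^2 + \lambda_I^2 ,
\tag{14.195b}
\]

with

\[
\lambda_R^2 = w_R^2 + x_R^2 + y_R^2 + z_R^2 ,
\tag{14.196a}
\]
\[
\lambda_I^2 = w_I^2 + x_I^2 + y_I^2 + z_I^2 .
\tag{14.196b}
\]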
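The translation in Exercise 14.17 can be spot-checked numerically before attempting it by hand. This is a sketch under stated assumptions, not the book's derivation: the γ-matrices are an assumed form of the Dirac representation (14.170), the quaternionic imaginaries are taken to be ıa = Iσa with σa = γ0γa, and a Python complex number a + bj is used as a stand-in for a + Ib. The helper name `quaternion_matrix` is hypothetical.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# ASSUMED Dirac representation (equation (14.170) is not reproduced in this section)
g0 = 1j * np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3))

one = np.eye(4, dtype=complex)
pseudoscalar = g0 @ g1 @ g2 @ g3
# ASSUMED: quaternionic imaginaries ı_a = I σ_a with σ_a = γ0 γ_a
quat_units = [one] + [pseudoscalar @ g0 @ g for g in (g1, g2, g3)]

def quaternion_matrix(w, x, y, z):
    """4x4 matrix of q = w + ıx + ȷy + kz; a complex a+bj stands in for a + I b."""
    return sum((c.real * one + c.imag * pseudoscalar) @ unit
               for c, unit in zip((w, x, y, z), quat_units))

rng = np.random.default_rng(0)
w, x, y, z = rng.normal(size=4) + 1j * rng.normal(size=4)
q = quaternion_matrix(w, x, y, z)
q_rev = quaternion_matrix(w, -x, -y, -z)   # reversal flips the sign of ı, ȷ, k

phi = q_rev[:, 0]       # q̄ acting on ⇑↑ picks out the first column
phi_bar = q[0, :]       # (⇑↑)† q picks out the first row
phi_14_192 = np.array([w.real - 1j * z.real, -1j * x.real + y.real,
                       -1j * w.imag - z.imag, -x.imag - 1j * y.imag])
lam_R2 = sum(c.real ** 2 for c in (w, x, y, z))
lam_I2 = sum(c.imag ** 2 for c in (w, x, y, z))

print(np.allclose(phi, phi_14_192))
print(np.isclose(phi_bar @ phi, lam_R2 - lam_I2),
      np.isclose(np.vdot(phi, phi), lam_R2 + lam_I2))
```

Reading the spinor off as the first column of the matrix, and the reverse spinor as the first row, is just the statement that ⇑↑ is the first unit column vector.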


Exercise 14.18 Relation between ϕ† and ϕ̄. Confirm equation (14.184) by showing from equation (14.191b) that

\[
\varphi^\dagger = i (\Uparrow\uparrow)^\dagger \, \mathbf{q} \, \gamma_0 .
\tag{14.197}
\]



14.24 Non-null Dirac spinor — particle and antiparticle

A non-null, or massive, Dirac spinor ϕ, one for which ϕ̄ϕ ≠ 0, is isomorphic (14.181) to a non-null reverse complex quaternion q̄, which can be factored as a non-zero complex scalar times a unit quaternion, a rotor. Thus a non-null Dirac spinor can be expressed as the product of a complex scalar λ = λR + IλI and a reverse Lorentz rotor R̄, acting on the time-up spin-up eigenvector ⇑↑,

\[
\varphi = \lambda \bar{R} \, \Uparrow\uparrow .
\tag{14.198}
\]

The complex scalar λ can be taken without loss of generality to lie in the right hemisphere of the complex plane (positive real part), since a minus sign can be absorbed into a spatial rotation by 2π of the rotor R̄. There is no further ambiguity in the decomposition (14.198) into scalar and rotor, because the squared magnitude λR λR̄ = λ² of the scaled rotor λR̄ is the same for any decomposition.

The fact that a non-null Dirac spinor ϕ encodes a Lorentz rotor shows that a non-null Dirac spinor in some sense “knows” about the Lorentz structure of spacetime. It is intriguing that the Lorentz structure of spacetime is built in to a non-null Dirac particle.

As discussed in §14.22, a pure time-up eigenvector ⇑ represents a particle in its own rest frame, while a pure time-down eigenvector ⇓ represents an antiparticle in its own rest frame. The time-up spin-up eigenvector ⇑↑ is by definition (14.198) equivalent to the unit scaled rotor, λR̄ = 1, so in this case the scalar λ is pure real. Lorentz transforming the eigenvector multiplies it by a rotor, but leaves the scalar λ unchanged, therefore pure real. Conversely, if the time-up spin-up eigenvector ⇑↑ is multiplied by the imaginary I, then according to the expression (14.192) the resulting spinor can be Lorentz transformed into a pure ⇓ state, corresponding to a pure antiparticle. Thus one may conclude that the real and imaginary parts (with respect to I) of the complex scalar λ = λR + IλI correspond respectively to particles and antiparticles. The magnitude squared ϕ†ϕ of the Dirac spinor, equation (14.195b),

\[
\varphi^\dagger \varphi = |\lambda|^2 = \lambda_R^2 + \lambda_I^2 ,
\tag{14.199}
\]

is the sum of the probabilities λR² of particles and λI² of antiparticles.

Among other things, the decomposition of a Dirac spinor into its particle and antiparticle parts shows that multiplying a non-null Dirac spinor by the pseudoscalar I converts a particle to an antiparticle, and vice versa.


14.25 Null Dirac Spinor

A null spinor is a spinor ϕ whose magnitude is zero,

\[
\bar\varphi \varphi = 0 .
\tag{14.200}
\]

Such a spinor is equal to a null complex quaternion q̄ acting on the time-up spin-up eigenvector ⇑↑,

\[
\varphi = \bar{q} \, \Uparrow\uparrow .
\tag{14.201}
\]

Physically, a null spinor represents a spin-½ particle moving at the speed of light. A non-trivial null spinor must be moving at the speed of light because if it were not, then there would be a rest frame where the rotor part of the spinor ϕ = λR̄ ⇑↑ would be unity, R̄ = 1, and the spinor, being non-trivial, λ ≠ 0, would not be null.

The null condition (14.200) is a complex constraint, which eliminates 2 of the 8 degrees of freedom of a complex quaternion, so that a null spinor has 6 degrees of freedom. Any non-trivial null complex quaternion q can be written uniquely as the product of a null factor (1 + Iı · n)/√2 and a real quaternion λU (Exercise 14.10):

\[
q = \frac{(1 + I \boldsymbol{\imath} \cdot \boldsymbol{n})}{\sqrt{2}} \, \lambda U .
\tag{14.202}
\]

Here n is a unit real 3-vector, λ is a positive real scalar, and U is a purely spatial (i.e. real, with no I part) rotor. The factor of 1/√2 is inserted for normalization purposes. Physically, equation (14.202) contains the instruction to boost to light speed in the direction n, then scale by the real scalar λ and rotate spatially by U. The 2 + 1 + 3 = 6 degrees of freedom from the real unit vector n, the real scalar λ, and the spatial rotor U in the expression (14.202) are precisely the number needed to specify a null quaternion.

One might have thought that the boost factor 1 + Iı · n in equation (14.202) would change under a Lorentz transformation, but in fact it is Lorentz-invariant. For if the boost factor 1 + Iı · n is transformed (multiplied) by any complex quaternion p + Ir, then the result

\[
(1 + I \boldsymbol{\imath} \cdot \boldsymbol{n})(p + I r) = (1 + I \boldsymbol{\imath} \cdot \boldsymbol{n})(p - \boldsymbol{\imath} \cdot \boldsymbol{n} \, r)
\tag{14.203}
\]

is the same unchanged boost factor 1 + Iı · n multiplied by a purely spatial transformation, the real quaternion p − ı · n r. Equation (14.203) is true because (ı · n)² = −1. Since the boost factor 1 + Iı · n is Lorentz-invariant, Lorentz transforming the null quaternion q (14.202) probes only 4 of the 6 degrees of freedom of the group of null quaternions.

The null Dirac spinor ϕ corresponding to the reverse q̄ of the null complex quaternion q, equation (14.202), is

\[
\varphi \equiv \bar{q} \, \Uparrow\uparrow = \lambda \bar{U} \, \frac{(1 - I \boldsymbol{\imath} \cdot \boldsymbol{n})}{\sqrt{2}} \, \Uparrow\uparrow .
\tag{14.204}
\]

It is natural to choose basis vectors of the representation to be eigenvectors of the Lorentz-invariant boost factor 1 − Iı · n. In the Dirac representation (14.170), the basis spinors (14.176) are eigenvectors of Iσ3, and it is natural to choose the 3-direction to be in either the positive or negative n direction, in which case

\[
(1 - I \boldsymbol{\imath} \cdot \boldsymbol{n}) \, \Uparrow\uparrow = (1 \mp I \imath_3) \, \Uparrow\uparrow = (1 \pm \sigma_3) \, \Uparrow\uparrow = (\Uparrow \mp \Downarrow) \uparrow .
\tag{14.205}
\]
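The invariance claim (14.203) and the nullness of the boost factor lend themselves to a quick numerical check. The following is a minimal sketch, assuming an explicit Dirac representation consistent with (14.177) (the book's (14.170) is not reproduced here), quaternionic imaginaries ıa = Iσa with σa = γ0γa, and the matrix form (14.191a) of reversal. The names `real_quaternion` and `reverse` are hypothetical helpers.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# ASSUMED Dirac representation (equation (14.170) is not reproduced in this section)
g0 = 1j * np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3))

one = np.eye(4, dtype=complex)
pseudoscalar = g0 @ g1 @ g2 @ g3
quat_i = [pseudoscalar @ g0 @ g for g in (g1, g2, g3)]   # ASSUMED: ı_a = I γ0 γ_a

def real_quaternion(c):
    """Real quaternion c[0] + c[1] ı + c[2] ȷ + c[3] k as a 4x4 matrix."""
    return c[0] * one + sum(ca * ia for ca, ia in zip(c[1:], quat_i))

def reverse(q):
    """Reversal in matrix form, using relation (14.191a)."""
    return -g0 @ q.conj().T @ g0

rng = np.random.default_rng(1)
n = rng.normal(size=3)
n /= np.linalg.norm(n)                            # unit 3-vector n
u = sum(na * ia for na, ia in zip(n, quat_i))     # ı·n, which squares to -1
boost = one + pseudoscalar @ u                    # boost factor 1 + I ı·n

p = real_quaternion(rng.normal(size=4))
r = real_quaternion(rng.normal(size=4))
lhs = boost @ (p + pseudoscalar @ r)              # left side of (14.203)
rhs = boost @ (p - u @ r)                         # right side of (14.203)

print(np.allclose(lhs, rhs), np.allclose(reverse(boost) @ boost, 0))
```

The second check confirms that the boost factor has zero squared magnitude, which is what makes any quaternion of the form (14.202) null.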


The null basis vectors are left- and right-handed chiral eigenvectors, eigenvectors of the chiral operator γ5 with eigenvalues ∓1 respectively,

\[
\gamma_5 \, \frac{(\Uparrow \mp \Downarrow) \uparrow}{\sqrt{2}} = \mp \frac{(\Uparrow \mp \Downarrow) \uparrow}{\sqrt{2}} .
\tag{14.206}
\]

The 1/√2 factor ensures unit normalization

\[
\left[ \frac{(\Uparrow \mp \Downarrow) \uparrow}{\sqrt{2}} \right]^\dagger \frac{(\Uparrow \mp \Downarrow) \uparrow}{\sqrt{2}} = 1 .
\tag{14.207}
\]

A general null Dirac spinor ϕ, equation (14.204), is the real scaled spatial reverse rotor λŪ acting on one of the two null chiral basis spinors, either left-handed (⇑−⇓)↑/√2, or right-handed (⇑+⇓)↑/√2,

\[
\varphi_L = \lambda \bar{U} \, \frac{(\Uparrow - \Downarrow) \uparrow}{\sqrt{2}} , \quad
\varphi_R = \lambda \bar{U} \, \frac{(\Uparrow + \Downarrow) \uparrow}{\sqrt{2}} .
\tag{14.208}
\]

A left- or right-handed null Dirac spinor is called a Weyl spinor.

Concept question 14.19 The null boost factor (1 + Iı · n) in a null quaternion, equation (14.202), is Lorentz-invariant, as shown by equation (14.203) (which you should confirm). Consequently a null Dirac spinor has a Lorentz-invariant boost axis n. Does a null 4-vector have a Lorentz-invariant axis? What does it mean physically that a null Dirac spinor has a Lorentz-invariant boost axis n?

14.26 Chiral decomposition of a Dirac spinor

A general (non-null or null) Dirac spinor ϕ can be decomposed into a sum of left- and right-handed chiral components

\[
\varphi = \varphi_L + \varphi_R ,
\tag{14.209}
\]

that are eigenvectors of the chiral operator γ5,

\[
\gamma_5 \, \varphi_{\substack{L \\ R}} = \mp \varphi_{\substack{L \\ R}} .
\tag{14.210}
\]

The left- and right-handed chiral components can be projected out by applying the chiral projection operators ½(1 ∓ γ5) (which are projection operators because their squares are themselves):

\[
\tfrac{1}{2} (1 \mp \gamma_5) \varphi = \varphi_{\substack{L \\ R}} .
\tag{14.211}
\]

The decomposition into chiral components is Lorentz-invariant because the pseudoscalar I, hence the chiral operator γ5 ≡ iI, is Lorentz-invariant, which is true because the pseudoscalar I commutes with any Lorentz rotor. The chiral projection operators are null, because the reverse of the chiral operator is $\bar\gamma_5 = \overline{iI} = -iI = -\gamma_5$, and its square is one, γ5² = 1, so

\[
\overline{(1 + \gamma_5)} \, (1 + \gamma_5) = (1 - \gamma_5) \, \overline{(1 - \gamma_5)} = (1 - \gamma_5)(1 + \gamma_5) = 0 .
\tag{14.212}
\]




Consequently each of the chiral components is null, hence massless, ϕ¯L ϕL = ϕ¯R ϕR = 0 .

(14.213)

Since a pure left- or right-handed spinor must be null, a non-null Dirac particle cannot be purely left- or right-handed.
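The properties of the chiral projection operators can be illustrated with explicit matrices. This is a sketch only: the γ-matrices are an assumed form of the Dirac representation (14.170), the pseudoscalar is taken as I = γ0γ1γ2γ3, and the hypothetical helper `adjoint` implements the reverse spinor in the component form (14.189).

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# ASSUMED Dirac representation (equation (14.170) is not reproduced in this section)
g0 = 1j * np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3))

one = np.eye(4, dtype=complex)
gamma5 = 1j * (g0 @ g1 @ g2 @ g3)                  # chiral operator γ5 = i I
P_L, P_R = (one - gamma5) / 2, (one + gamma5) / 2  # chiral projectors (14.211)

def adjoint(phi):
    """Reverse spinor per (14.189): conjugate, with the two ⇓ components negated."""
    return phi.conj() * np.array([1, 1, -1, -1])

rng = np.random.default_rng(2)
phi = rng.normal(size=4) + 1j * rng.normal(size=4)  # arbitrary (generally non-null) spinor
phi_L, phi_R = P_L @ phi, P_R @ phi

print(np.allclose(P_L @ P_L, P_L), np.allclose(P_L @ P_R, 0))    # projectors
print(np.allclose(phi_L + phi_R, phi))                           # decomposition (14.209)
print(abs(adjoint(phi_L) @ phi_L), abs(adjoint(phi_R) @ phi_R))  # both vanish, (14.213)
print(abs(adjoint(phi) @ phi))                                   # generally non-zero
```

The last two lines make the closing remark concrete: each chiral component is null on its own, while their sum generally is not.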

14.27 Dirac equation

The Dirac equation is the relativistic quantum mechanical wave equation for spin-½ particles. By itself, the Dirac equation does not provide a consistent theory of relativistic quantum mechanics, because in relativistic quantum mechanics there is no such thing as a single particle that evolves in isolation. Rather, a “fundamental” particle such as an electron is dressed in a sea of particle-antiparticle pairs polarized out of the vacuum by the presence of the electron. Nevertheless, the Dirac equation is a fundamental building block for the quantum field theory of spin-½ particles.

The Dirac theory starts with the momentum 4-vector in the form $p = \gamma_m p^m$, where γm are not only the basis vectors of an orthonormal tetrad, but also the basis vectors of the spacetime (Clifford) algebra. In the Dirac slash notation $p = \not{p}$, but the notation is superfluous here. For a particle of rest mass m, the geometric square of the momentum is

\[
pp = \gamma_m \gamma_n p^m p^n = p_n p^n = -m^2 .
\tag{14.214}
\]

The vanishing sum pp + m² factors as

\[
pp + m^2 = (p + im)(p - im) = 0 .
\tag{14.215}
\]

The factorization provides the motivation for the Dirac wave equation for a free relativistic spin-½ particle or antiparticle,

\[
(p - im)\varphi = 0 \quad \text{(particle)} ,
\tag{14.216a}
\]
\[
(p + im)\varphi = 0 \quad \text{(antiparticle)} ,
\tag{14.216b}
\]

in which ϕ is a Dirac spinor, and the momentum operator $p \equiv \gamma^m p_m$ should be replaced, according to the usual rules of quantum mechanics, by minus i times the gradient operator $\partial = \gamma^m \partial_m$,

\[
p = -i\partial .
\tag{14.217}
\]

With respect to locally inertial coordinates $x^m \equiv \{t, x^i\}$,

\[
p^0 = -p_0 = i \frac{\partial}{\partial t} , \quad
p^i = p_i = -i \frac{\partial}{\partial x^i} .
\tag{14.218}
\]

With the replacement (14.217), the Dirac equations (14.216) are

\[
(\partial + m)\varphi = 0 \quad \text{(particle)} ,
\tag{14.219a}
\]
\[
(\partial - m)\varphi = 0 \quad \text{(antiparticle)} .
\tag{14.219b}
\]
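The algebraic form (14.216) of the Dirac equation, and the factorization (14.215) behind it, can be checked with explicit matrices. The sketch below assumes a Dirac representation consistent with (14.177) (equation (14.170) itself is not reproduced here) and an arbitrarily chosen on-shell momentum; the numerical values are illustrative only.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# ASSUMED Dirac representation (equation (14.170) is not reproduced in this section)
g0 = 1j * np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3))
one = np.eye(4, dtype=complex)

mass = 1.3                                   # illustrative rest mass
p_space = np.array([0.4, -0.2, 0.7])         # illustrative spatial momentum
energy = np.sqrt(mass ** 2 + p_space @ p_space)
p_up = np.array([energy, *p_space])          # components p^m, on the mass shell

p = sum(pm * g for pm, g in zip(p_up, (g0, g1, g2, g3)))   # p = γ_m p^m

geometric_square = p @ p                                    # should equal -m² (14.214)
factorized = (p + 1j * mass * one) @ (p - 1j * mass * one)  # should vanish (14.215)
particle_rank = np.linalg.matrix_rank(p - 1j * mass * one)  # 4 minus number of solutions

# Rest frame: p = m γ0, and the time-up eigenvectors solve the particle equation
p_rest = mass * g0
up_up = np.array([1, 0, 0, 0], dtype=complex)
rest_residual = (p_rest - 1j * mass * one) @ up_up

print(np.allclose(geometric_square, -mass ** 2 * one), np.allclose(factorized, 0))
print(particle_rank, np.allclose(rest_residual, 0))
```

The rank of p − im being 2 reflects that, for a given momentum, the particle equation (14.216a) has a two-dimensional solution space, one solution for each spin state.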


The reason for choosing opposite signs for the mass m in the particle and antiparticle equations is discussed further in §14.28. For now, two comments can be made about the different choice of sign. Firstly, as seen in §14.24, particles and antiparticles belong to distinct representations that do not mix under Lorentz transformations, so it is consistent to allow them to satisfy different equations. Secondly, the factorization (14.215) involves factors with both signs of m, so it is reasonable — one could say demanded — that both equations would occur.

In flat (Minkowski) space, the Dirac wave equations (14.219) for a free particle or antiparticle are most easily solved by Fourier transforming with respect to space and time. The differential wave equations (14.219) then revert to being algebraic equations (14.216), with p being the momentum of the corresponding Fourier mode.

If the Dirac spinor ϕ is a particle as opposed to an antiparticle, so that ϕ is (up to an irrelevant real scale factor) R̄⇑ where ⇑ is any rest-frame particle eigenvector, then the following calculation

\[
p\varphi = (\bar{R} \, m \gamma_0 R)(\bar{R} \Uparrow) = m \bar{R} \gamma_0 \Uparrow = i m \bar{R} \Uparrow = i m \varphi \quad \text{(particle)} ,
\tag{14.220}
\]

confirms that the particle Dirac equation (14.216a) recovers the expected momentum p = R̄mγ0R, which is the rest frame momentum $p \equiv p^m \gamma_m = m\gamma_0$ Lorentz transformed into the tetrad frame. Likewise, if the Dirac spinor ϕ is an antiparticle as opposed to a particle, so that ϕ is (up to an irrelevant real scale factor) R̄⇓ where ⇓ is any rest-frame antiparticle eigenvector, then

\[
p\varphi = (\bar{R} \, m \gamma_0 R)(\bar{R} \Downarrow) = m \bar{R} \gamma_0 \Downarrow = -i m \bar{R} \Downarrow = -i m \varphi \quad \text{(antiparticle)}
\tag{14.221}
\]

confirms that the antiparticle Dirac equation (14.216b) again recovers the expected momentum p = R̄mγ0R.

The sign flip between the particle and antiparticle equations (14.220) and (14.221) occurs because the time basis vector γ0 yields opposite signs when acting on a rest-frame particle eigenvector ⇑ versus rest-frame antiparticle eigenvector ⇓, equation (14.177). The same sign flip occurs if the rest-frame antiparticle eigenvector is taken to be I⇑ (per §14.24) instead of ⇓, since γ0 anticommutes with I.

Fourier-transformed back into real space, the free Dirac spinor wavefunctions ϕ in flat space are, for either particles or antiparticles,

FREQUENCY HAS WRONG SIGN

\[
\varphi = \varphi_0 \, e^{i p_m x^m} ,
\tag{14.222}
\]

where $p_m$ is the momentum, satisfying $p = \gamma^m p_m$, and ϕ0 is the value of the Dirac spinor at the origin $x^m = 0$. Regarded as an element of the spacetime algebra, the exponential factor $e^{i p_m x^m}$ is a scalar, so it commutes with ϕ0, so it does not matter on which side of ϕ0 the exponential is placed.

14.28 Antiparticles are negative mass particles moving backwards in time

The original Dirac treatment took the particle Dirac equation (14.219a) as describing both particles and antiparticles. This led to solutions in which the free-wave factor contained not only a positive frequency component, as in equation (14.222), but also a negative frequency component $e^{-i p_m x^m}$. These negative frequency components were interpreted as indicating an antiparticle, with negative mass m.

In the original Dirac theory, the prediction of particles with negative rest mass was problematic: where are such particles? And if they existed, why wouldn’t pairs of positive and negative mass particles spontaneously pop out of the vacuum, causing a catastrophic breakdown of the vacuum? To solve the problem, Dirac proposed that all negative energy states of the vacuum are already occupied, and that antiparticles correspond to holes in the negative energy sea.

Dirac’s conundrum was eventually solved by Feynman, who realised that anti-particles are equivalent to negative mass particles moving backwards in time. A negative mass particle moving backwards in time looks like a positive mass particle moving forwards in time. Feynman’s solution obviates the need for any negative energy sea of antiparticles.

Feynman’s dictum corresponds mathematically to choosing particle and antiparticle spinors not only to yield opposite signs when acted on by the time basis vector γ0 in their rest frames, as in Dirac theory, but also to have the opposite sign of mass m in the Dirac equations (14.219). This is the approach adopted in the previous section, §14.27.

14.29 Dirac equation with electromagnetism

The Dirac equation for a spin-½ particle of charge e moving in an external electromagnetic field with potential $A \equiv A^m \gamma_m$ is given by the same Dirac equations (14.216), but now the momentum p is rewritten in terms of the canonical momentum π,

\[
p = \pi + eA ,
\tag{14.223}
\]

and it is the canonical momentum π that is replaced by minus i times the gradient operator $\partial \equiv \gamma^m \partial_m$,

\[
\pi = -i\partial .
\tag{14.224}
\]

Consequently the Dirac equation for a charged particle or antiparticle is

\[
(\partial + ieA + m)\varphi = 0 \quad \text{(particle)} ,
\tag{14.225a}
\]
\[
(\partial + ieA - m)\varphi = 0 \quad \text{(antiparticle)} .
\tag{14.225b}
\]

Equation (14.225b) appears to describe an antiparticle as having mass −m opposite to that of a particle, and charge e the same as that of a particle. If the antiparticle is interpreted as having negative mass moving backwards in time, then the antiparticle has positive mass m moving forwards in time, and charge −e opposite to that of a particle. Equations (14.225) are not easy to solve in general. Analytic solutions exist in some cases, such as when the electromagnetic field consists of a uniform magnetic field B.

14.30 CPT

It was seen in §14.24 that multiplying a non-null Dirac spinor ϕ by the pseudoscalar I converts a particle spinor into an antiparticle spinor, and vice versa. This operation is conventionally called CPT,

\[
CPT : \varphi \to I\varphi .
\tag{14.226}
\]

The operation is called CPT because it is conventionally parsed into 3 distinct discrete transformations C, P, and T, discussed in turn below.

In the spacetime algebra, the CPT operation flips the sign of all the spacetime axes γm,

\[
CPT : \gamma_m \to I \gamma_m I^{-1} = -\gamma_m ,
\tag{14.227}
\]

which is true because the pseudoscalar I anti-commutes with each of the basis vectors γm. The CPT operation leaves all even multivectors unchanged since I commutes with all even multivectors. In particular, CPT is Lorentz invariant, since I commutes with Lorentz rotors.

The CPT operation converts the particle Dirac equation (14.225a) into the antiparticle Dirac equation (14.225b):

\[
CPT : -I(\partial + ieA + m)\varphi = (\partial + ieA - m) I \varphi .
\tag{14.228}
\]

Equation (14.228) shows that the spinor Iϕ satisfies the Dirac equation (14.225b) for an antiparticle, consistent with the conclusion of §14.24 that multiplying a non-null Dirac spinor by I converts a particle into an antiparticle.
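The algebra behind (14.227) and (14.228) reduces to the facts that I anticommutes with vectors and commutes with scalars and bivectors. A minimal numerical sketch follows, assuming an explicit Dirac representation consistent with (14.177) and I = γ0γ1γ2γ3. As a further assumption, the gradient operator is replaced by a constant vector `D` with arbitrary real components, as it would be when acting on a single Fourier mode up to a factor of i; the charge and mass values are illustrative.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# ASSUMED Dirac representation (equation (14.170) is not reproduced in this section)
g0 = 1j * np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3))
gammas = [g0, g1, g2, g3]
one = np.eye(4, dtype=complex)

I_ps = g0 @ g1 @ g2 @ g3                 # ASSUMED: pseudoscalar I = γ0 γ1 γ2 γ3
I_inv = np.linalg.inv(I_ps)

# (14.227): conjugation by I flips every basis vector
flip_ok = all(np.allclose(I_ps @ g @ I_inv, -g) for g in gammas)
# I commutes with products of two basis vectors, hence with all even multivectors
even_ok = all(np.allclose(I_ps @ a @ b, a @ b @ I_ps) for a in gammas for b in gammas)

# (14.228) as an operator identity, with vectors standing in for ∂ and A
rng = np.random.default_rng(3)
D = sum(c * g for c, g in zip(rng.normal(size=4), gammas))
A = sum(c * g for c, g in zip(rng.normal(size=4), gammas))
charge, mass = 0.3, 1.7
lhs = -I_ps @ (D + 1j * charge * A + mass * one)
rhs = (D + 1j * charge * A - mass * one) @ I_ps

print(flip_ok, even_ok, np.allclose(lhs, rhs))
```

Only the mass term changes sign relative to the vector terms when I is moved through the wave operator, which is why the particle equation maps onto the antiparticle equation.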

14.31 Charge conjugation C

A Dirac spinor, non-null or null, contains two distinct components that remain separate under Lorentz transformations. For a non-null spinor the two components are particles and antiparticles. For a null spinor the two components are the left- and right-handed chiralities. For a non-null spinor, the CPT operation of multiplying the spinor by I converts a particle into an antiparticle and vice versa. But for a null spinor, multiplying by I = −iγ5 leaves left- and right-handed particles as they are: it does not transform opposite chiralities into each other.

A charge conjugation operation C can be defined with the property that it converts particles of one type into the opposite type for both non-null and null spinors: particles into antiparticles and vice versa, and left-handed into right-handed chiralities and vice versa. In quantum mechanics, it is natural to regard particles and antiparticles as belonging to complex (with respect to i) conjugate representations. The charge conjugation operator C is defined by the requirement that it transforms the spacetime basis vectors γm to their complex conjugates (with respect to i)

\[
C : \gamma_m \to C \gamma_m C^{-1} = \gamma_m^* .
\tag{14.229}
\]

In the Dirac representation (14.170), the condition (14.229) requires that C commute with γ2, but anticommute with γ0, γ1, and γ3. A suitable matrix is γ2 itself,

\[
C = \gamma_2 .
\tag{14.230}
\]

Charge conjugation of a Dirac spinor ϕ is accomplished by taking the complex conjugate (with respect to i) of the spinor, and multiplying by the charge conjugation operator C:

\[
C : \varphi \to C\varphi^* .
\tag{14.231}
\]




The charge-conjugates Cϕ∗ of the non-null basis eigenvectors (14.176) in the Dirac representation are

\[
C(\Uparrow\uparrow)^* = \Downarrow\downarrow , \quad
C(\Uparrow\downarrow)^* = -\Downarrow\uparrow , \quad
C(\Downarrow\uparrow)^* = -\Uparrow\downarrow , \quad
C(\Downarrow\downarrow)^* = \Uparrow\uparrow ,
\tag{14.232}
\]

while the charge-conjugates of the null chiral basis eigenvectors (14.206) are

\[
C \left[ \frac{(\Uparrow \mp \Downarrow) \uparrow}{\sqrt{2}} \right]^* = \pm \frac{(\Uparrow \pm \Downarrow) \downarrow}{\sqrt{2}} .
\tag{14.233}
\]

Equations (14.232) and (14.233) show that charge conjugation not only converts rest-frame particle eigenvectors ⇑ into rest-frame antiparticle eigenvectors ⇓, and vice versa, but also converts a left-handed null eigenvector into a right-handed null eigenvector and vice versa. Since the operation of complex conjugation commutes with Lorentz transformation (because Lorentz rotors R are independent of the quantum mechanical imaginary i), it is true in general that charge conjugation switches particles into antiparticles, and left-handed into right-handed chiralities.

The complex conjugate (with respect to i) of the charged particle Dirac equation (14.225a) is

\[
(\partial^* - ieA^* + m)\varphi^* = 0 .
\tag{14.234}
\]

Complex conjugation leaves the components $\partial_m$ and $A^m$ of the gradient operator and electromagnetic potential unchanged, but conjugates the basis vectors γm. Left-multiplying equation (14.234) by C, and commuting C through the wave operator, yields

\[
(\partial - ieA + m) C\varphi^* = 0 .
\tag{14.235}
\]

Thus the charge-conjugated spinor Cϕ∗ satisfies the Dirac equation (14.225a) for a particle with the same mass m but opposite charge −e.
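The defining property (14.229) and the basis relations (14.232) and (14.233) can be verified directly once explicit matrices are chosen. The sketch below assumes a Dirac representation in which γ2 is real and γ0, γ1, γ3 are imaginary, consistent with the commutation requirements stated for C; the book's (14.170) is not reproduced in this section, and `charge_conjugate` is a hypothetical helper name.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# ASSUMED Dirac representation (equation (14.170) is not reproduced in this section)
g0 = 1j * np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3))

C = g2                                   # charge conjugation matrix (14.230)
C_inv = np.linalg.inv(C)

# (14.229): conjugation by C reproduces the complex conjugate of each γ_m
conj_ok = all(np.allclose(C @ g @ C_inv, g.conj()) for g in (g0, g1, g2, g3))

def charge_conjugate(phi):
    """Charge conjugation (14.231): complex conjugate, then multiply by C."""
    return C @ phi.conj()

up_up, up_dn, dn_up, dn_dn = np.eye(4, dtype=complex)
left = (up_up - dn_up) / np.sqrt(2)      # left-handed null basis spinor (14.206)
right = (up_up + dn_up) / np.sqrt(2)     # right-handed null basis spinor (14.206)

print(conj_ok)
print(np.allclose(charge_conjugate(up_up), dn_dn),
      np.allclose(charge_conjugate(up_dn), -dn_up))
print(np.allclose(charge_conjugate(left), (up_dn + dn_dn) / np.sqrt(2)),
      np.allclose(charge_conjugate(right), -(up_dn - dn_dn) / np.sqrt(2)))
```

The last line shows the chirality switch: the conjugate of the (⇑−⇓) combination is a (⇑+⇓) combination, and vice versa.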

14.32 Parity reversal P

The parity operation P is the operation of reversing all the spatial axes, while keeping the time axis unchanged,

\[
P : \gamma_m \to P \gamma_m P^{-1} =
\begin{cases}
\gamma_m & m = 0 , \\
-\gamma_m & m = 1, 2, 3 .
\end{cases}
\tag{14.236}
\]

A suitable matrix is the time axis γ0

\[
P = \gamma_0 .
\tag{14.237}
\]

Parity reversal transforms a Dirac spinor ϕ as

\[
P : \varphi \to P\varphi .
\tag{14.238}
\]

Parity reversal commutes with spatial rotations, but not with Lorentz boosts. Parity reversal transforms the left- and right-handed chiral components of a Dirac spinor into each other:

\[
P : \varphi_L \leftrightarrow \varphi_R ,
\tag{14.239}
\]

which is true because parity reversal flips the sign of the pseudoscalar, $P I P^{-1} = -I$, hence also the sign of the chiral operator γ5 ≡ iI.
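The action of parity on the basis vectors, the pseudoscalar, and the chiral projectors can be confirmed with a few matrix products. This is an illustrative sketch, assuming an explicit Dirac representation consistent with (14.177) and I = γ0γ1γ2γ3; equation (14.170) itself is not reproduced in this section.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# ASSUMED Dirac representation (equation (14.170) is not reproduced in this section)
g0 = 1j * np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3))
one = np.eye(4, dtype=complex)

P = g0                                   # parity matrix (14.237)
P_inv = np.linalg.inv(P)
I_ps = g0 @ g1 @ g2 @ g3                 # ASSUMED: pseudoscalar I = γ0 γ1 γ2 γ3
gamma5 = 1j * I_ps
proj_L, proj_R = (one - gamma5) / 2, (one + gamma5) / 2

time_ok = np.allclose(P @ g0 @ P_inv, g0)                          # time axis kept
space_ok = all(np.allclose(P @ g @ P_inv, -g) for g in (g1, g2, g3))  # spatial axes flipped
pseudo_ok = np.allclose(P @ I_ps @ P_inv, -I_ps)                   # P I P^-1 = -I
swap_ok = np.allclose(P @ proj_L @ P_inv, proj_R)                  # left <-> right

# Parity commutes with a spatial rotation generator but not with a boost generator
rotation_commutes = np.allclose(P @ (g1 @ g2), (g1 @ g2) @ P)
boost_commutes = np.allclose(P @ (g0 @ g1), (g0 @ g1) @ P)

print(time_ok, space_ok, pseudo_ok, swap_ok, rotation_commutes, boost_commutes)
```

The final flag is expected to be False: the bivector γ0γ1 generating a boost anticommutes with γ0, which is the sense in which parity does not commute with Lorentz boosts.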

14.33 Time reversal T

The Wigner time reversal operation T is conventionally defined so that CPT is the product of the three operations C, P, and T. This requires that time-reversal transforms a Dirac spinor ϕ as

\[
T : \varphi \to T\varphi^*
\tag{14.240}
\]

with

\[
T = \gamma_1 \gamma_3 ,
\tag{14.241}
\]

so that

\[
CPT = \gamma_2 \gamma_0 \gamma_1 \gamma_3 = I .
\tag{14.242}
\]
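The matrix product in (14.242) is easy to confirm numerically. The sketch assumes an explicit Dirac representation consistent with (14.177) and the ordering I = γ0γ1γ2γ3 for the pseudoscalar; both are assumptions, since (14.170) is not reproduced in this section.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)

# ASSUMED Dirac representation (equation (14.170) is not reproduced in this section)
g0 = 1j * np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, 1j * s], [-1j * s, Z2]]) for s in (s1, s2, s3))

C, P, T = g2, g0, g1 @ g3                # (14.230), (14.237), (14.241)
cpt_matrix = C @ P @ T                   # γ2 γ0 γ1 γ3
I_ps = g0 @ g1 @ g2 @ g3                 # ASSUMED: pseudoscalar I = γ0 γ1 γ2 γ3

print(np.allclose(cpt_matrix, I_ps))
print(np.allclose(T.imag, 0))            # T is a real matrix in this representation
```

Moving γ2 past γ0 and then past γ1 costs two sign changes, so γ2γ0γ1γ3 and γ0γ1γ2γ3 agree.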

14.34 Majorana spinor

The charge conjugation operation considered in §14.31 switched left- and right-handed null spinors into each other. However, it is possible for a null spinor to be its own antiparticle. In this case the complex (with respect to i) conjugate representation is itself, rather than being a distinct representation. A null spinor which is its own antiparticle is a Majorana spinor.

For a Majorana spinor, complex conjugation should leave the representation unchanged; that is, the γ-matrices should be real, in contrast to the Dirac representation (14.170), where the γ-matrices are complex. A suitable real representation of the γ-matrices is

\[
\gamma_0 = \begin{pmatrix} 0 & i\sigma_2 \\ i\sigma_2 & 0 \end{pmatrix} , \quad
\gamma_1 = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_1 \end{pmatrix} , \quad
\gamma_2 = \begin{pmatrix} 0 & -i\sigma_2 \\ i\sigma_2 & 0 \end{pmatrix} , \quad
\gamma_3 = \begin{pmatrix} \sigma_3 & 0 \\ 0 & \sigma_3 \end{pmatrix} .
\tag{14.243}
\]

The chiral matrix γ5 ≡ iI is

\[
\gamma_5 = \begin{pmatrix} \sigma_2 & 0 \\ 0 & -\sigma_2 \end{pmatrix} .
\tag{14.244}
\]
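A real representation of the γ-matrices can be tested for the Clifford algebra relations in a few lines. The sketch below is hedged: the block layout is a reading of equation (14.243), whose typesetting is damaged in this copy, so the specific matrices `m0` to `m3` should be treated as an assumption. The metric signature diag(−1, 1, 1, 1) and the ordering I = γ0γ1γ2γ3 are also assumed.

```python
import numpy as np

s1 = np.array([[0.0, 1.0], [1.0, 0.0]])      # σ1 (real)
s3 = np.array([[1.0, 0.0], [0.0, -1.0]])     # σ3 (real)
i_s2 = np.array([[0.0, 1.0], [-1.0, 0.0]])   # i σ2, which is a real matrix
Z2 = np.zeros((2, 2))

# ASSUMED block layout, read from the damaged equation (14.243)
m0 = np.block([[Z2, i_s2], [i_s2, Z2]])
m1 = np.block([[s1, Z2], [Z2, s1]])
m2 = np.block([[Z2, -i_s2], [i_s2, Z2]])
m3 = np.block([[s3, Z2], [Z2, s3]])
majorana = [m0, m1, m2, m3]

# Clifford algebra: γ_m γ_n + γ_n γ_m = 2 η_mn with η = diag(-1, 1, 1, 1)
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
clifford_ok = all(
    np.allclose(majorana[m] @ majorana[n] + majorana[n] @ majorana[m],
                2 * eta[m, n] * np.eye(4))
    for m in range(4) for n in range(4)
)
all_real = all(np.isrealobj(g) for g in majorana)

# Chiral matrix γ5 = i I: anticommutes with every γ_m and squares to one
gamma5 = 1j * (m0 @ m1 @ m2 @ m3)
chiral_ok = (all(np.allclose(gamma5 @ g, -g @ gamma5) for g in majorana)
             and np.allclose(gamma5 @ gamma5, np.eye(4)))

print(clifford_ok, all_real, chiral_ok)
print(np.allclose(gamma5.conj(), -gamma5))   # γ5 is pure imaginary here
```

Because the γ-matrices are real while γ5 is pure imaginary in such a representation, complex conjugation leaves the γ-matrices alone but flips the sign of γ5.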

14.35 Covariant derivatives revisited

Under a Lorentz transformation by rotor R, any multivector a transforms as a → R̄aR. The covariant derivative $D \equiv \gamma^m D_m$ must transform likewise as

\[
R : D \to \bar{R} D R .
\tag{14.245}
\]

14.36 General relativistic Dirac equation

To convert the Dirac equations (14.219) or (14.225) into general relativistic equations, the derivative $\partial \equiv \gamma^m \partial_m$ must be converted to a covariant derivative.

14.37 3D Vectors as rank-2 spinors

THIS NEEDS TO BE CONVERTED FROM 3D TO SPACETIME.

Concept question 14.20 One is used to thinking of a spin-½ particle as in some sense the square root of a spin-1 particle, a vector. How is this concept compatible with the idea that a spin-½ object is a (scaled) rotor?

A Pauli spinor ϕ contains two complex components $\varphi_a$, where the index a runs over the two indices ↑ and ↓. The Pauli spinor is a spinor of rank 1, having one spinor index. Under a rotation by rotor R, the spinor ϕ transforms as ϕ → R̄ϕ. Under a rotation, the spinor components $\varphi_a$ of the Pauli spinor transform as

\[
\varphi_a \to R^b{}_a \, \varphi_b
\tag{14.246}
\]

where $R^b{}_a$ is the special unitary 2 × 2 matrix representing the reverse rotor R̄. The rotor R itself has components $R_a{}^b$. Note the placement of indices: for the rotor R, the first index is down and the second up, while for the reverse rotor R̄, the first index is up and the second down. The rotor and its reverse are inverse to each other, satisfying $R\bar{R} = R^c{}_a R_c{}^b = \delta_a^b$.

The Hermitian conjugate spinor ϕ† transforms in the opposite way ϕ† → ϕ†R, and can therefore be written as a spinor with contravariant (raised) index, $\varphi^\dagger = \varphi^a$. The contravariant spinor components $\varphi^a$ transform as

\[
\varphi^a \to \varphi^b R_b{}^a .
\tag{14.247}
\]

The product of a Hermitian conjugate spinor χ† with another spinor ϕ defines their scalar product, which is unchanged by a rotation,

\[
\chi^\dagger \varphi \to \chi^\dagger R \bar{R} \varphi = \chi^\dagger \varphi = \chi^a \varphi_a .
\tag{14.248}
\]

Explicitly, the scalar product $\chi^a \varphi_a$ is the complex number

\[
\chi^a \varphi_a = \chi^*_\uparrow \varphi_\uparrow + \chi^*_\downarrow \varphi_\downarrow .
\tag{14.249}
\]

An element a of the 3D geometric algebra transforms under rotation as a → R̄aR. The behaviour under rotation shows that a 3D multivector is a rank-2 spinor $a = a_a{}^b$, a complex 2 × 2 matrix, with the first index being covariant (lowered), and the second being contravariant (raised). The spinor components $a_a{}^b$ of the multivector transform as

\[
a_a{}^b \to R^c{}_a \, a_c{}^d \, R_d{}^b .
\tag{14.250}
\]


A 3D multivector a contains 8 real components, comprising a scalar, a vector, a pseudovector, and a pseudoscalar. All 8 components are represented by the complex 2 × 2 spinor $a_a{}^b$. The scalar and pseudoscalar components are contained in the complex trace $a_a{}^a$ of the spinor. The rotational transformation a → R̄aR preserves the grade of the multivector, so scalars transform to scalars, vectors to vectors, pseudovectors to pseudovectors, and pseudoscalars to pseudoscalars.

A reversed multivector ā transforms in the same way ā → R̄āR as the parent multivector a. Consequently, like the multivector a, the reversed multivector ā transforms like a rank-2 spinor with one covariant (lowered) and one contravariant (raised) index. It is consistent to write the reversed spinor as the original spinor with the first index raised and the second lowered, $\bar{a} = a^a{}_b$.

The above has shown that a 3D multivector a is represented naturally as a rank-2 spinor $a_a{}^b$ with one covariant and one contravariant index. This might seem strange: one might have expected that a vector — a spin-1 particle — might be represented as a rank-2 spinor $a_{ab}$ with two covariant indices — a sum of tensor products $\varphi \otimes \chi = \varphi_a \chi_b$ of two spin-½ particles. Under a rotation, the components $a_{ab}$ of a covariant rank-2 spinor transform as

\[
a_{ab} \to R^c{}_a \, R^d{}_b \, a_{cd} .
\tag{14.251}
\]

One aspect of this transformation is straightforward: the following combination of tensor products of spinors is invariant under rotation, and is therefore a scalar:

\[
\uparrow \otimes \downarrow - \downarrow \otimes \uparrow .
\tag{14.252}
\]

The remaining tensor combinations ↑ ⊗ ↑, ↓ ⊗ ↓, and ↑ ⊗ ↓ + ↓ ⊗ ↑ provide a basis for vectors, but under rotation they transform into complex, not real, linear combinations of each other. This contrasts with the representation of multivectors a by spinors $a_a{}^b$ with one covariant and one contravariant index, where the grade-preserving property of the rotational transformation a → R̄aR ensures that vectors rotate into real, not complex, linear combinations of each other.
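The invariance of the antisymmetric combination (14.252), and the complex mixing of the symmetric combinations, can be seen with an explicit SU(2) matrix. This is a minimal sketch: the rotation axis and angle are arbitrary illustrative choices, and the rank-2 spinor is represented as a 2 × 2 array that transforms with the same matrix on both indices, as in (14.251).

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# An arbitrary SU(2) rotation matrix: cos(θ/2) - i sin(θ/2) n·σ
angle = 0.9
axis = np.array([1.0, 2.0, 2.0]) / 3.0          # unit vector
n_dot_sigma = axis[0] * s1 + axis[1] * s2 + axis[2] * s3
U = np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * n_dot_sigma

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

def rotate_rank2(a):
    """Transform both covariant indices of a rank-2 spinor with the same matrix."""
    return U @ a @ U.T

singlet = np.outer(up, down) - np.outer(down, up)    # ↑⊗↓ − ↓⊗↑
triplet_upup = np.outer(up, up)                      # ↑⊗↑

singlet_rotated = rotate_rank2(singlet)
triplet_rotated = rotate_rank2(triplet_upup)

print(np.allclose(singlet_rotated, singlet))         # invariant: a scalar
print(np.abs(triplet_rotated.imag).max() > 1e-6)     # coefficients are complex
```

The singlet is invariant because U ε Uᵀ = det(U) ε for any 2 × 2 matrix U, and det(U) = 1 for a special unitary matrix.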

PART SIX BLACK HOLE INTERIORS

Concept Questions

1. Explain how the equation for the Gullstrand-Painlevé metric (15.16) encodes not merely a metric but a full vierbein.
2. In what sense does the Gullstrand-Painlevé metric (15.16) depict a flow of space? [Are the coordinates moving? If not, then what is moving?]
3. If space has no substance, what does it mean that space falls into a black hole?
4. Would there be any gravitational field in a spacetime where space fell at constant velocity instead of accelerating?
5. In spherically symmetric spacetimes, what is the most important Einstein equation, the one that causes Reissner-Nordström black holes to be repulsive in their interiors, and causes mass inflation in non-empty (non Reissner-Nordström) charged black holes?

What’s important?

1. The tetrad formalism provides a firm mathematical foundation for the concept that space falls faster than light inside a black hole.
2. Whereas the Kerr-Newman geometry of an ideal rotating black hole contains inside its horizon wormhole and white hole connections to other universes, real black holes are subject to the mass inflation instability discovered by Eric Poisson & Werner Israel (1990, “Internal structure of black holes,” Phys. Rev. D 41, 1796-1809).

15 Black hole waterfalls

15.1 Tetrads move through coordinates

As already discussed in §11.3, the way in which metrics are commonly written, as a (weighted) sum of squares of differentials,
$$ ds^2 = \gamma_{mn} \, e^m{}_\mu \, e^n{}_\nu \, dx^\mu dx^\nu , \tag{15.1} $$
encodes not only a metric $g_{\mu\nu} = \gamma_{mn} e^m{}_\mu e^n{}_\nu$, but also an inverse vierbein $e^m{}_\mu$, and consequently a vierbein $e_m{}^\mu$, and associated tetrad $\boldsymbol{\gamma}_m$. Most commonly the tetrad metric is orthonormal (Minkowski), $\gamma_{mn} = \eta_{mn}$, but other tetrad metrics, such as Newman-Penrose, occur. Usually it is self-evident from the form of the line-element what the tetrad metric $\gamma_{mn}$ is in any particular case.

If the tetrad is orthonormal, $\gamma_{mn} = \eta_{mn}$, then the 4-velocity $u^m$ of an object at rest in the tetrad, or equivalently the 4-velocity of the tetrad rest frame itself, is
$$ u^m = \{1, 0, 0, 0\} . \tag{15.2} $$
The tetrad-frame 4-velocity (15.2) of the tetrad rest frame is transformed to a coordinate-frame 4-velocity $u^\mu$ in the usual way, by applying the vierbein,
$$ \frac{dx^\mu}{d\tau} \equiv u^\mu = e_m{}^\mu u^m = e_0{}^\mu . \tag{15.3} $$
Equation (15.3) says that the tetrad rest frame moves through the coordinates at coordinate 4-velocity given by the zeroth row of the vierbein, $dx^\mu/d\tau = e_0{}^\mu$. The coordinate 4-velocity $u^\mu$ is related to the lapse $\alpha$ and shift $\beta^\alpha$ in the ADM formalism by $u^\mu = \{1, \beta^\alpha\}/\alpha$, equation (13.10).

The idea that locally inertial frames move through the coordinates provides the simplest way to conceptualize black holes. The motion of locally inertial frames through coordinates is what is meant by the “dragging of inertial frames” around rotating masses.


Figure 15.1 The fish upstream can make way against the current, but the fish downstream is swept to the bottom of the waterfall. Art by Wildrose Hamilton.

15.2 Gullstrand-Painlevé waterfall

The Gullstrand-Painlevé metric is a version of the metric for a spherical (Schwarzschild or Reissner-Nordström) black hole discovered in 1921 independently by Allvar Gullstrand (1922, “Allgemeine Lösung des statischen Einkörperproblems in der Einsteinschen Gravitationstheorie,” translated by http://babelfish.altavista.com/tr as “General solution of the static body problem in Einstein’s Gravitation Theory,” Arkiv. Mat. Astron. Fys. 16(8), 1–15) and Paul Painlevé (1921, “La mécanique classique et la théorie de la relativité”, C. R. Acad. Sci. (Paris) 173, 677–680). Although Gullstrand’s paper was published in 1922, after Painlevé’s, it appears that Gullstrand’s work has priority. Gullstrand’s paper was dated 25 May 1921, whereas Painlevé’s is a write-up of a presentation to the Académie des Sciences in Paris on 24 October 1921. Moreover, Gullstrand seems to have had a better grasp of what he had discovered than Painlevé, for Gullstrand recognized that observables such as the redshift of light from the Sun are unaffected by the choice of coordinates in the Schwarzschild geometry, whereas Painlevé, noting that the spatial metric was flat at constant free-fall time, $dt_{\rm ff} = 0$, concluded in his final sentence that, as regards the redshift of light and such, “c’est pure imagination de prétendre tirer du $ds^2$ des conséquences de cette nature” (“it is pure imagination to claim to draw consequences of this nature from the $ds^2$”).

Although neither Gullstrand nor Painlevé understood it, their metric paints a picture of space falling like a river, or waterfall, into a spherical black hole, Figure 15.1. The river has two key features: first, the river flows in Galilean fashion through a flat Galilean background, equation (15.19); and second, as a freely-falling fishy swims through the river, its 4-velocity, or more generally any 4-vector attached to it, evolves by a series of infinitesimal Lorentz boosts induced by the change in the velocity of the river from place to place, equation (15.24).
Because the river moves in Galilean fashion, it can, and inside the horizon does, move faster than light through the background coordinates. However, objects moving in the river move according to the rules of special relativity, and so cannot move faster than light through the river.

15.2.1 Gullstrand-Painlevé tetrad

The Gullstrand-Painlevé metric (7.26) is
$$ ds^2 = - dt_{\rm ff}^2 + (dr - \beta \, dt_{\rm ff})^2 + r^2 (d\theta^2 + \sin^2\!\theta \, d\phi^2) , \tag{15.4} $$
where $\beta$ is defined to be the radial velocity of a person who free-falls radially from rest at infinity,
$$ \beta = \frac{dr}{d\tau} = \frac{dr}{dt_{\rm ff}} , \tag{15.5} $$
and $t_{\rm ff}$ is the free-fall time, the proper time experienced by a person who free-falls from rest at infinity. The radial velocity $\beta$ is the (apparently) Newtonian escape velocity
$$ \beta = \mp \sqrt{\frac{2 M(r)}{r}} , \tag{15.6} $$
where $M(r)$ is the interior mass within radius $r$, and the sign is − (infalling) for a black hole, + (outfalling) for a white hole. For the Schwarzschild or Reissner-Nordström geometry the interior mass $M(r)$ is the mass $M$ at infinity minus the mass $Q^2/2r$ in the electric field outside $r$,
$$ M(r) = M - \frac{Q^2}{2r} . \tag{15.7} $$
Figure 15.2 illustrates the velocity fields in Schwarzschild and Reissner-Nordström black holes. Horizons occur where the radial velocity $\beta$ equals the speed of light,
$$ \beta = \mp 1 , \tag{15.8} $$
with − for black hole solutions, + for white hole solutions. The phenomenology of Schwarzschild and Reissner-Nordström black holes has already been explored in Chapters 7 and 8.
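As a numerical illustration of equations (15.6)–(15.8), the following Python sketch evaluates the infall velocity and locates the horizons, where β = −1, and the turnaround radius, where β = 0. It assumes geometric units (G = c = 1); the function names are illustrative choices, not notation from the text.

```python
import math

def interior_mass(r, M, Q):
    """Interior mass M(r) = M - Q^2/(2r), equation (15.7)."""
    return M - Q * Q / (2.0 * r)

def river_velocity(r, M, Q):
    """Infall velocity beta = -sqrt(2 M(r)/r) for a black hole, equation (15.6)."""
    m = interior_mass(r, M, Q)
    if m < 0.0:
        # Inside the turnaround radius M(r) < 0 and beta is not real.
        raise ValueError("interior mass is negative inside the turnaround radius")
    return -math.sqrt(2.0 * m / r)

def horizon_radii(M, Q):
    """Radii where beta^2 = 1, i.e. roots of r^2 - 2 M r + Q^2 = 0."""
    disc = M * M - Q * Q
    if disc < 0.0:
        return ()  # no horizons when |Q| > M
    root = math.sqrt(disc)
    return (M - root, M + root)  # (inner, outer); inner is r = 0 when Q = 0

def turnaround_radius(M, Q):
    """Radius where M(r) = 0, so that beta = 0."""
    return Q * Q / (2.0 * M)

if __name__ == "__main__":
    M, Q = 1.0, 0.96  # the charge used in Figure 15.2, in units of M
    for r_h in horizon_radii(M, Q):
        print(r_h, river_velocity(r_h, M, Q))
    print(turnaround_radius(M, Q))
```

The horizon condition follows from setting $\beta^2 = 2M(r)/r = 1$ with $M(r)$ from (15.7), which gives $r^2 - 2Mr + Q^2 = 0$.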

Exercise 15.1 Schwarzschild to Gullstrand-Painlevé. Show that the Schwarzschild metric transforms into the Gullstrand-Painlevé metric under the coordinate transformation of the time coordinate
$$ dt_{\rm ff} = dt - \frac{\beta}{1 - \beta^2} \, dr . \tag{15.9} $$

Exercise 15.2 Radial free-fall from rest. Confirm that $\beta$ given by equation (15.6) is indeed the velocity (15.5) of a person who free-falls radially from rest at infinity in the Reissner-Nordström geometry.

Figure 15.2 Radial velocity β in (upper panel) a Schwarzschild black hole, and (lower panel) a Reissner-Nordström black hole with electric charge Q = 0.96. [Figure labels: Horizon; Outer horizon; Inner horizon; Turnaround.]

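The Gullstrand-Painlevé line-element (15.4) encodes an inverse vierbein with an orthonormal tetrad metric $\gamma_{mn} = \eta_{mn}$ through
$$ e^0{}_\mu \, dx^\mu = dt_{\rm ff} , \tag{15.10a} $$
$$ e^1{}_\mu \, dx^\mu = dr - \beta \, dt_{\rm ff} , \tag{15.10b} $$
$$ e^2{}_\mu \, dx^\mu = r \, d\theta , \tag{15.10c} $$
$$ e^3{}_\mu \, dx^\mu = r \sin\theta \, d\phi . \tag{15.10d} $$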

Explicitly, the inverse vierbein $e^m{}_\mu$ of the Gullstrand-Painlevé line-element (15.4), and the corresponding vierbein $e_m{}^\mu$, are the matrices
$$ e^m{}_\mu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -\beta & 1 & 0 & 0 \\ 0 & 0 & r & 0 \\ 0 & 0 & 0 & r \sin\theta \end{pmatrix} , \qquad e_m{}^\mu = \begin{pmatrix} 1 & \beta & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/r & 0 \\ 0 & 0 & 0 & 1/(r \sin\theta) \end{pmatrix} . \tag{15.11} $$
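The two matrices in (15.11) can be checked numerically. The sketch below, which uses only the Python standard library and helper names chosen for illustration, confirms that $e_m{}^\mu e^n{}_\mu = \delta_m^n$ and that $g_{\mu\nu} = \eta_{mn} e^m{}_\mu e^n{}_\nu$ reproduces the coefficients of the line-element (15.4). Rows are labelled by the tetrad index m and columns by the coordinate index µ = (t_ff, r, θ, φ).

```python
import math

def inverse_vierbein(r, theta, beta):
    """e^m_mu of equation (15.11)."""
    return [[1.0, 0.0, 0.0, 0.0],
            [-beta, 1.0, 0.0, 0.0],
            [0.0, 0.0, r, 0.0],
            [0.0, 0.0, 0.0, r * math.sin(theta)]]

def vierbein(r, theta, beta):
    """e_m^mu of equation (15.11)."""
    return [[1.0, beta, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0 / r, 0.0],
            [0.0, 0.0, 0.0, 1.0 / (r * math.sin(theta))]]

def contraction(v, e):
    """Sum over mu of e_m^mu e^n_mu, which should be the identity matrix."""
    return [[sum(v[m][mu] * e[n][mu] for mu in range(4)) for n in range(4)]
            for m in range(4)]

def metric(e):
    """g_mu_nu = eta_mn e^m_mu e^n_nu with eta = diag(-1, 1, 1, 1)."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[sum(eta[m] * e[m][mu] * e[m][nu] for m in range(4))
             for nu in range(4)] for mu in range(4)]

if __name__ == "__main__":
    r, theta = 3.0, 1.0
    beta = -math.sqrt(2.0 / r)  # Schwarzschild with M = 1 (assumed units)
    e = inverse_vierbein(r, theta, beta)
    print(contraction(vierbein(r, theta, beta), e))
    print(metric(e))
```

Expanding (15.4) by hand gives $g_{t_{\rm ff} t_{\rm ff}} = -(1 - \beta^2)$, $g_{t_{\rm ff} r} = -\beta$, and $g_{rr} = 1$, which is what the `metric` helper should return.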

According to equation (15.3), the coordinate 4-velocity of the tetrad frame through the coordinates is
$$ \left\{ \frac{dt_{\rm ff}}{d\tau} , \frac{dr}{d\tau} , \frac{d\theta}{d\tau} , \frac{d\phi}{d\tau} \right\} = u^\mu = e_0{}^\mu = \{1, \beta, 0, 0\} , \tag{15.12} $$
consistent with the claim (15.5) that $\beta$ represents a radial velocity, while $t_{\rm ff}$ coincides with the proper time in the tetrad frame.

The tetrad and coordinate axes $\boldsymbol{\gamma}_m$ and $\boldsymbol{g}_\mu$ are related to each other by the vierbein and inverse vierbein in the usual way, $\boldsymbol{\gamma}_m = e_m{}^\mu \boldsymbol{g}_\mu$ and $\boldsymbol{g}_\mu = e^m{}_\mu \boldsymbol{\gamma}_m$. The Gullstrand-Painlevé orthonormal tetrad axes $\boldsymbol{\gamma}_m$ are thus related to the coordinate axes $\boldsymbol{g}_\mu$ by
$$ \boldsymbol{\gamma}_0 = \boldsymbol{g}_{t_{\rm ff}} + \beta \boldsymbol{g}_r , \quad \boldsymbol{\gamma}_1 = \boldsymbol{g}_r , \quad \boldsymbol{\gamma}_2 = \boldsymbol{g}_\theta / r , \quad \boldsymbol{\gamma}_3 = \boldsymbol{g}_\phi / (r \sin\theta) . \tag{15.13} $$
Physically, the Gullstrand-Painlevé tetrad (15.13) are the axes of locally inertial orthonormal frames (with spatial axes $\boldsymbol{\gamma}_i$ oriented in the polar directions $r$, $\theta$, $\phi$) attached to observers who free-fall radially, without rotating, starting from zero velocity and zero angular momentum at infinity. The fact that the tetrad axes $\boldsymbol{\gamma}_m$ are parallel-transported, without precessing, along the worldlines of the radially free-falling observers can be confirmed by checking that the tetrad connections $\Gamma^n{}_{m0}$ with final index 0 all vanish, which implies that
$$ \frac{d \boldsymbol{\gamma}_m}{d\tau} = \partial_0 \boldsymbol{\gamma}_m \equiv \Gamma^n{}_{m0} \, \boldsymbol{\gamma}_n = 0 . \tag{15.14} $$
That the proper time derivative $d/d\tau$ in equation (15.14) of a person at rest in the tetrad frame, with 4-velocity (15.2), is equal to the directed time derivative $\partial_0$ follows from
$$ \frac{d}{d\tau} = u^\mu \frac{\partial}{\partial x^\mu} = u^m \partial_m = \partial_0 . \tag{15.15} $$

15.2.2 Gullstrand-Painlevé-Cartesian tetrad

The manner in which the Gullstrand-Painlevé line-element depicts a flow of space into a black hole is elucidated further if the line-element is written in Cartesian rather than spherical polar coordinates. Introduce a Cartesian coordinate system $x^\mu \equiv \{t_{\rm ff}, x^i\} \equiv \{t_{\rm ff}, x, y, z\}$. The Gullstrand-Painlevé metric in these Cartesian coordinates is
$$ ds^2 = - dt_{\rm ff}^2 + \delta_{ij} (dx^i - \beta^i dt_{\rm ff})(dx^j - \beta^j dt_{\rm ff}) , \tag{15.16} $$
with implicit summation over spatial indices $i, j = 1, 2, 3$. The $\beta^i$ in the metric (15.16) are the components of the radial velocity expressed in Cartesian coordinates,
$$ \beta^i = \beta \left\{ \frac{x}{r} , \frac{y}{r} , \frac{z}{r} \right\} . \tag{15.17} $$
The inverse vierbein $e^m{}_\mu$ and vierbein $e_m{}^\mu$ encoded in the Gullstrand-Painlevé-Cartesian line-element (15.16) are
$$ e^m{}_\mu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -\beta^1 & 1 & 0 & 0 \\ -\beta^2 & 0 & 1 & 0 \\ -\beta^3 & 0 & 0 & 1 \end{pmatrix} , \qquad e_m{}^\mu = \begin{pmatrix} 1 & \beta^1 & \beta^2 & \beta^3 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{15.18} $$
The tetrad axes $\boldsymbol{\gamma}_m$ of the Gullstrand-Painlevé-Cartesian line-element (15.16) are related to the coordinate tangent axes $\boldsymbol{g}_\mu$ by
$$ \boldsymbol{\gamma}_0 = \boldsymbol{g}_{t_{\rm ff}} + \beta^i \boldsymbol{g}_i , \quad \boldsymbol{\gamma}_i = \boldsymbol{g}_i , \tag{15.19} $$
and conversely the coordinate tangent axes $\boldsymbol{g}_\mu$ are related to the tetrad axes $\boldsymbol{\gamma}_m$ by
$$ \boldsymbol{g}_{t_{\rm ff}} = \boldsymbol{\gamma}_0 - \beta^i \boldsymbol{\gamma}_i , \quad \boldsymbol{g}_i = \boldsymbol{\gamma}_i . \tag{15.20} $$

Note that the tetrad-frame contravariant components $\beta^i$ of the radial velocity coincide with the coordinate-frame contravariant components $\beta^i$; for clarification of this point see the more general equation (15.48) for a rotating black hole.

The Gullstrand-Painlevé-Cartesian tetrad axes (15.19) are the same as the tetrad axes (15.13), but rotated to point in Cartesian directions $x$, $y$, $z$ rather than in polar directions $r$, $\theta$, $\phi$. Like the polar tetrad, the Cartesian tetrad axes $\boldsymbol{\gamma}_m$ are parallel-transported, without precessing, along the worldlines of radially free-falling observers, as can be confirmed by checking once again that the tetrad connections $\Gamma^n{}_{m0}$ with final index 0 all vanish.

Remarkably, the transformation (15.19) from coordinate to tetrad axes is just a Galilean transformation of space and time, which shifts the time axis by velocity $\beta$ along the direction of motion, but which leaves unchanged both the time component of the time axis and all the spatial axes. In other words, the black hole behaves as if it were a river of space that flows radially inward through Galilean space and time at the Newtonian escape velocity.

15.2.3 Gullstrand-Painlevé fishies

The Gullstrand-Painlevé line element paints a picture of locally inertial frames falling like a river of space into a spherical black hole. What happens to fishies swimming in that river? Of course general relativity supplies a mathematical answer in the form of the geodesic equation of motion (15.21). Does that mathematical answer lead to further conceptual insight?

Consider a fishy swimming in the Gullstrand-Painlevé river, with some arbitrary tetrad-frame 4-velocity $u^m$, and consider a tetrad-frame 4-vector $p^k$ attached to the fishy. If the fishy is in free-fall, then the geodesic equation of motion for $p^k$ is as usual
$$ \frac{dp^k}{d\tau} + \Gamma^k{}_{mn} u^n p^m = 0 . \tag{15.21} $$
As remarked in §11.13, for a constant (for example Minkowski) tetrad metric, as here, the tetrad connections $\Gamma^k{}_{mn}$ constitute a set of four generators of Lorentz transformations, one in each of the directions $n$. In particular $\Gamma^k{}_{mn} u^n$ is the generator of a Lorentz transformation along the path of a fishy moving with 4-velocity $u^n$. In a small (infinitesimal) time $\delta\tau$, the fishy moves a proper distance $\delta\xi^n \equiv u^n \delta\tau$ relative to the infalling river. This proper distance $\delta\xi^n = e^n{}_\nu \delta x^\nu = \delta^n_\nu (\delta x^\nu - \beta^\nu \delta t_{\rm ff}) = \delta x^n - \beta^n \delta\tau$ equals the distance $\delta x^n$ moved relative to the background Gullstrand-Painlevé-Cartesian coordinates, minus the distance $\beta^n \delta\tau$ moved by the river. The geodesic equation (15.21) says that the change $\delta p^k$ in the tetrad 4-vector $p^k$ in the time $\delta\tau$ is
$$ \delta p^k = - \Gamma^k{}_{mn} \, \delta\xi^n p^m . \tag{15.22} $$

Equation (15.22) describes an infinitesimal Lorentz transformation $- \Gamma^k{}_{mn} \delta\xi^n$ of the 4-vector $p^k$. Equation (15.22) is quite general in general relativity: it says that as a 4-vector $p^k$ free-falls through a system of locally inertial tetrads, it finds itself Lorentz-transformed relative to those tetrads. What is special about the Gullstrand-Painlevé-Cartesian tetrad is that the tetrad-frame connections, computed by the usual formula (11.41), are given by the coordinate gradient of the radial velocity (the following equation is valid component-by-component despite the non-matching up-down placement of indices)
$$ \Gamma^0{}_{ij} = \Gamma^i{}_{0j} = \frac{\partial \beta^i}{\partial x^j} \quad (i, j = 1, 2, 3) . \tag{15.23} $$
The same property, that the tetrad connections are a pure coordinate gradient, holds also for the Doran-Cartesian tetrad for a rotating black hole, equation (15.51). With the connections (15.23), the change $\delta p^k$ (15.22) in the tetrad 4-vector is
$$ \delta p^0 = - \delta\beta^i p^i , \quad \delta p^i = - \delta\beta^i p^0 , \tag{15.24} $$
where $\delta\beta^i$ is the change in the velocity of the river as seen in the tetrad frame,
$$ \delta\beta^i = \delta\xi^j \frac{\partial \beta^i}{\partial x^j} . \tag{15.25} $$
But equation (15.24) is nothing more than an infinitesimal Lorentz boost by a velocity change $\delta\beta^i$. This shows that a fishy swimming in the river follows the rules of special relativity, being Lorentz boosted by tidal changes $\delta\beta^i$ in the river velocity from place to place.

Is it correct to interpret equation (15.25) as giving the change $\delta\beta^i$ in the river velocity seen by a fishy? Of course general relativity demands that equation (15.25) be mathematically correct; the issue is merely one of interpretation. Shouldn’t the change in the river velocity really be
$$ \delta\beta^i \stackrel{?}{=} \delta x^\nu \frac{\partial \beta^i}{\partial x^\nu} , \tag{15.26} $$


where $\delta x^\nu$ is the full change in the coordinate position of the fishy? No. Part of the change (15.26) in the river velocity can be attributed to the change in the velocity of the river itself over the time $\delta\tau$, which is $\delta x^\nu_{\rm river} \, \partial\beta^i / \partial x^\nu$ with $\delta x^\nu_{\rm river} = \beta^\nu \delta\tau = \beta^\nu \delta t_{\rm ff}$. The change in the velocity relative to the flowing river is
$$ \delta\beta^i = (\delta x^\nu - \delta x^\nu_{\rm river}) \frac{\partial \beta^i}{\partial x^\nu} = (\delta x^\nu - \beta^\nu \delta t_{\rm ff}) \frac{\partial \beta^i}{\partial x^\nu} , \tag{15.27} $$
which reproduces the earlier expression (15.25). Indeed, in the picture of fishies being carried by the river, it is essential to subtract the change in velocity of the river itself, as in equation (15.27), because otherwise fishies at rest in the river (going with the flow) would not continue to remain at rest in the river.
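The river prescription can be turned into a simple numerical integrator. The sketch below is a minimal illustration, not a production scheme: it assumes a Schwarzschild river with M = 1 in geometric units, uses first-order Euler steps, and all function names are illustrative. Each step moves the fishy through the coordinates by $\delta x^i = (u^i + \beta^i u^0)\,\delta\tau$, computes the tidal change (15.25) from the displacement $\delta\xi^j = u^j \delta\tau$ relative to the river, and applies the infinitesimal boost (15.24) to the tetrad-frame 4-velocity.

```python
import math

M = 1.0  # black hole mass in geometric units (an assumption of this sketch)

def river_velocity(x):
    """Cartesian river velocity beta^i = -sqrt(2M/r) x^i / r, equations (15.6), (15.17)."""
    r = math.sqrt(sum(c * c for c in x))
    b = -math.sqrt(2.0 * M / r)
    return [b * c / r for c in x]

def river_gradient(x):
    """Analytic gradient d(beta^i)/d(x^j) of the river velocity above."""
    r = math.sqrt(sum(c * c for c in x))
    k = -math.sqrt(2.0 * M)
    return [[k * ((1.0 if i == j else 0.0) * r ** -1.5
                  - 1.5 * x[i] * x[j] * r ** -3.5)
             for j in range(3)] for i in range(3)]

def step(x, u, dtau):
    """Advance position x^i and tetrad-frame 4-velocity u^m by one Euler step."""
    beta = river_velocity(x)
    grad = river_gradient(x)
    dxi = [u[j + 1] * dtau for j in range(3)]            # distance moved relative to the river
    dbeta = [sum(grad[i][j] * dxi[j] for j in range(3))  # tidal change, equation (15.25)
             for i in range(3)]
    du0 = -sum(dbeta[i] * u[i + 1] for i in range(3))    # boost, equation (15.24)
    dui = [-dbeta[i] * u[0] for i in range(3)]
    new_x = [x[i] + (u[i + 1] + beta[i] * u[0]) * dtau for i in range(3)]
    new_u = [u[0] + du0] + [u[i + 1] + dui[i] for i in range(3)]
    return new_x, new_u

def energy(x, u):
    """Conserved energy per unit mass, E = u^0 + beta^i u^i, for this static river."""
    beta = river_velocity(x)
    return u[0] + sum(beta[i] * u[i + 1] for i in range(3))

if __name__ == "__main__":
    g = 1.0 / math.sqrt(1.0 - 0.3 ** 2)
    x, u = [10.0, 0.0, 0.0], [g, 0.0, 0.3 * g, 0.0]
    e0 = energy(x, u)
    for _ in range(2000):
        x, u = step(x, u, 0.001)
    print(x, u, energy(x, u) - e0)
```

A fishy at rest in the river, $u^m = \{1, 0, 0, 0\}$, has $\delta\xi^j = 0$ and therefore stays at rest in the river while being carried inward at $dx^i/d\tau = \beta^i$, as the text requires. The energy check relies on $\beta^i$ being a gradient, so that $\partial\beta^i/\partial x^j$ is symmetric.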

15.3 Boyer-Lindquist tetrad

The Boyer-Lindquist metric for an ideal rotating black hole was explored already in Chapter 9. With the tetrad formalism in hand, the advantages of the Boyer-Lindquist tetrad for portraying the Kerr-Newman geometry become manifest. With respect to the orthonormal Boyer-Lindquist tetrad, the electromagnetic field is purely radial, and the energy-momentum and Weyl tensors are diagonal. The Boyer-Lindquist tetrad is aligned with the principal (ingoing or outgoing) null congruences.

The Boyer-Lindquist orthonormal tetrad is encoded in the Boyer-Lindquist metric
$$ ds^2 = - \frac{\Delta}{\rho^2} \left( dt - a \sin^2\!\theta \, d\phi \right)^2 + \frac{\rho^2}{\Delta} \, dr^2 + \rho^2 d\theta^2 + \frac{R^4 \sin^2\!\theta}{\rho^2} \left( d\phi - \frac{a}{R^2} \, dt \right)^2 , \tag{15.28} $$
where
$$ R \equiv \sqrt{r^2 + a^2} , \quad \rho \equiv \sqrt{r^2 + a^2 \cos^2\!\theta} , \quad \Delta \equiv R^2 - 2 M r + Q^2 = R^2 (1 - \beta^2) . \tag{15.29} $$
Explicitly, the vierbein $e_m{}^\mu$ of the Boyer-Lindquist orthonormal tetrad is
$$ e_m{}^\mu = \frac{1}{\rho} \begin{pmatrix} R^2/\sqrt{\Delta} & 0 & 0 & a/\sqrt{\Delta} \\ 0 & \sqrt{\Delta} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ a \sin\theta & 0 & 0 & 1/\sin\theta \end{pmatrix} , \tag{15.30} $$
with inverse vierbein $e^m{}_\mu$
$$ e^m{}_\mu = \begin{pmatrix} \sqrt{\Delta}/\rho & 0 & 0 & - a \sin^2\!\theta \, \sqrt{\Delta}/\rho \\ 0 & \rho/\sqrt{\Delta} & 0 & 0 \\ 0 & 0 & \rho & 0 \\ - a \sin\theta/\rho & 0 & 0 & R^2 \sin\theta/\rho \end{pmatrix} . \tag{15.31} $$
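As a consistency check on (15.30) and (15.31), the following Python sketch builds both matrices at a sample point outside the outer horizon, where $\Delta > 0$, and verifies that they are inverses of one another. The sample values and function names are illustrative assumptions; columns are ordered µ = (t, r, θ, φ).

```python
import math

def bl_scalars(r, theta, M, Q, a):
    """R^2, rho and Delta of equation (15.29)."""
    R2 = r * r + a * a
    rho = math.sqrt(r * r + (a * math.cos(theta)) ** 2)
    Delta = R2 - 2.0 * M * r + Q * Q
    return R2, rho, Delta

def bl_vierbein(r, theta, M, Q, a):
    """e_m^mu of equation (15.30); requires Delta > 0."""
    R2, rho, Delta = bl_scalars(r, theta, M, Q, a)
    sd, s = math.sqrt(Delta), math.sin(theta)
    return [[R2 / (rho * sd), 0.0, 0.0, a / (rho * sd)],
            [0.0, sd / rho, 0.0, 0.0],
            [0.0, 0.0, 1.0 / rho, 0.0],
            [a * s / rho, 0.0, 0.0, 1.0 / (rho * s)]]

def bl_inverse_vierbein(r, theta, M, Q, a):
    """e^m_mu of equation (15.31); requires Delta > 0."""
    R2, rho, Delta = bl_scalars(r, theta, M, Q, a)
    sd, s = math.sqrt(Delta), math.sin(theta)
    return [[sd / rho, 0.0, 0.0, -a * s * s * sd / rho],
            [0.0, rho / sd, 0.0, 0.0],
            [0.0, 0.0, rho, 0.0],
            [-a * s / rho, 0.0, 0.0, R2 * s / rho]]

def contraction(v, e):
    """Sum over mu of e_m^mu e^n_mu, which should be the identity matrix."""
    return [[sum(v[m][mu] * e[n][mu] for mu in range(4)) for n in range(4)]
            for m in range(4)]

if __name__ == "__main__":
    args = (3.0, 0.7, 1.0, 0.3, 0.5)  # r, theta, M, Q, a (sample values)
    for row in contraction(bl_vierbein(*args), bl_inverse_vierbein(*args)):
        print(row)
```

The only non-trivial entries are in the t–φ block, where the identity reduces to $R^2 - a^2 \sin^2\!\theta = \rho^2$.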

With respect to the Boyer-Lindquist tetrad, only the time component $A^t$ of the electromagnetic potential $A^m$ is non-vanishing,
$$ A^m = \left\{ \frac{Q r}{\rho \sqrt{\Delta}} , 0 , 0 , 0 \right\} . \tag{15.32} $$


Only the radial components $E_r$ and $B_r$ of the electric and magnetic fields are non-vanishing, and they are given by the complex combination
$$ E_r + i B_r = \frac{Q}{(r - i a \cos\theta)^2} , \tag{15.33} $$
or explicitly
$$ E_r = \frac{Q \left( r^2 - a^2 \cos^2\!\theta \right)}{\rho^4} , \quad B_r = \frac{2 Q a r \cos\theta}{\rho^4} . \tag{15.34} $$
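The step from the complex form (15.33) to the explicit components (15.34) can be confirmed numerically with Python's built-in complex numbers. The function names and sample values below are illustrative.

```python
import math

def radial_fields_complex(r, theta, Q, a):
    """(E_r, B_r) from the complex combination, equation (15.33)."""
    z = Q / complex(r, -a * math.cos(theta)) ** 2
    return z.real, z.imag

def radial_fields_explicit(r, theta, Q, a):
    """(E_r, B_r) from the explicit expressions, equation (15.34)."""
    c = a * math.cos(theta)
    rho4 = (r * r + c * c) ** 2
    return Q * (r * r - c * c) / rho4, 2.0 * Q * r * c / rho4

if __name__ == "__main__":
    print(radial_fields_complex(3.0, 0.7, 0.5, 0.8))
    print(radial_fields_explicit(3.0, 0.7, 0.5, 0.8))
```

The agreement follows from $1/(r - i a \cos\theta)^2 = (r + i a \cos\theta)^2 / \rho^4$.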

The electromagnetic field (15.33) satisfies Maxwell’s equations (11.64) and (11.65) with zero electric charge and current, $j^n = 0$, except at the singularity $\rho = 0$.

The non-vanishing components of the tetrad-frame Einstein tensor $G_{mn}$ are
$$ G_{mn} = \frac{Q^2}{\rho^4} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{15.35} $$
which is the energy-momentum tensor of the electromagnetic field. The non-vanishing components of the tetrad-frame Weyl tensor $C_{klmn}$ are
$$ - \tfrac{1}{2} C_{trtr} = \tfrac{1}{2} C_{\theta\phi\theta\phi} = C_{t\theta t\theta} = C_{t\phi t\phi} = - C_{r\theta r\theta} = - C_{r\phi r\phi} = \operatorname{Re} C , \tag{15.36a} $$
$$ \tfrac{1}{2} C_{tr\theta\phi} = C_{t\theta r\phi} = - C_{t\phi r\theta} = \operatorname{Im} C , \tag{15.36b} $$
where $C$ is the complex Weyl scalar
$$ C = - \frac{1}{(r - i a \cos\theta)^3} \left( M - \frac{Q^2}{r + i a \cos\theta} \right) . \tag{15.37} $$
In the Boyer-Lindquist tetrad, the photon 4-velocity $v^m$ on the principal null congruences is radial,
$$ v^t = \pm \frac{\rho}{\sqrt{\Delta}} , \quad v^r = \pm \frac{\rho}{\sqrt{\Delta}} , \quad v^\theta = 0 , \quad v^\phi = 0 . \tag{15.38} $$

Exercise 15.3 Dragging of inertial frames around a Kerr-Newman black hole. What is the coordinate-frame 4-velocity $u^\mu$ of the Boyer-Lindquist tetrad through the Boyer-Lindquist coordinates?

15.4 Doran waterfall

The picture of space falling into a black hole like a river or waterfall works also for rotating black holes. For Kerr-Newman rotating black holes, the counterpart of the Gullstrand-Painlevé metric is the Doran (2000) metric.


The space river that falls into a rotating black hole has a twist. One might have expected that the rotation of the black hole would be manifested by a velocity that spirals inward, but this is not the case. Instead, the river is characterized not merely by a velocity but also by a twist. The velocity and the twist together comprise a 6-dimensional river bivector $\omega_{km}$, equation (15.52) below, whose electric part is the velocity, and whose magnetic part is the twist. Recall that the 6-dimensional group of Lorentz transformations is generated by a combination of 3-dimensional Lorentz boosts and 3-dimensional spatial rotations. A fishy that swims through the river is Lorentz boosted by tidal changes in the velocity, and rotated by tidal changes in the twist, equation (15.61).

Thanks to the twist, unlike the Gullstrand-Painlevé metric, the Doran metric is not spatially flat at constant free-fall time $t_{\rm ff}$. Rather, the spatial metric is sheared in the azimuthal direction. Just as the velocity produces a Lorentz boost that makes the metric non-flat with respect to the time components, so also the twist produces a rotation that makes the metric non-flat with respect to the spatial components.

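15.4.1 Doran-Cartesian coordinates

In place of the polar coordinates $\{r, \theta, \phi_{\rm ff}\}$ of the Doran metric, introduce corresponding Doran-Cartesian coordinates $\{x, y, z\}$ with $z$ taken along the rotation axis of the black hole (the black hole rotates right-handedly about $z$, for positive spin parameter $a$)
$$ x \equiv R \sin\theta \cos\phi_{\rm ff} , \quad y \equiv R \sin\theta \sin\phi_{\rm ff} , \quad z \equiv r \cos\theta . \tag{15.39} $$
The metric in Doran-Cartesian coordinates $x^\mu \equiv \{t_{\rm ff}, x^i\} \equiv \{t_{\rm ff}, x, y, z\}$ is
$$ ds^2 = - dt_{\rm ff}^2 + \delta_{ij} \left( dx^i - \beta^i \alpha_\kappa dx^\kappa \right) \left( dx^j - \beta^j \alpha_\lambda dx^\lambda \right) , \tag{15.40} $$
where $\alpha_\mu$ is the rotational velocity vector
$$ \alpha_\mu = \left\{ 1 , \frac{a y}{R^2} , - \frac{a x}{R^2} , 0 \right\} , \tag{15.41} $$
and $\beta^\mu$ is the velocity vector
$$ \beta^\mu = \frac{\beta R}{\rho} \left\{ 0 , \frac{x r}{R \rho} , \frac{y r}{R \rho} , \frac{z R}{r \rho} \right\} . \tag{15.42} $$
The rotational velocity and radial velocity vectors are orthogonal,
$$ \alpha_\mu \beta^\mu = 0 . \tag{15.43} $$
For the Kerr-Newman metric, the radial velocity $\beta$ is
$$ \beta = \mp \frac{\sqrt{2 M r - Q^2}}{R} , \tag{15.44} $$
with − for black hole (infalling), + for white hole (outfalling) solutions. Horizons occur where
$$ \beta = \mp 1 , \tag{15.45} $$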


with $\beta = -1$ for black hole horizons, and $\beta = 1$ for white hole horizons. Note that the squared magnitude $\beta_\mu \beta^\mu$ of the velocity vector is not $\beta^2$, but rather differs from $\beta^2$ by a factor of $R^2/\rho^2$:
$$ \beta_\mu \beta^\mu = \beta_m \beta^m = \frac{\beta^2 R^2}{\rho^2} . \tag{15.46} $$
The point of the convention adopted here is that $\beta(r)$ is a function only of $r$, rather than depending also on $\theta$ through $\rho$. Moreover, with the convention here, $\beta$ is $\mp 1$ at horizons, equation (15.45). Finally, the 4-velocity $\beta^\mu$ is simply related to $\beta$ by $\beta^\mu = \beta \, \partial r / \partial x^\mu$.

The Doran-Cartesian metric (15.40) encodes a vierbein $e_m{}^\mu$ and inverse vierbein $e^m{}_\mu$
$$ e_m{}^\mu = \delta_m^\mu + \alpha_m \beta^\mu , \quad e^m{}_\mu = \delta^m_\mu - \alpha_\mu \beta^m . \tag{15.47} $$
Here the tetrad-frame components $\alpha_m$ of the rotational velocity vector and $\beta^m$ of the radial velocity vector are
$$ \alpha_m = e_m{}^\mu \alpha_\mu = \delta_m^\mu \alpha_\mu , \quad \beta^m = e^m{}_\mu \beta^\mu = \delta^m_\mu \beta^\mu , \tag{15.48} $$
which works thanks to the orthogonality (15.43) of $\alpha_\mu$ and $\beta^\mu$. Equation (15.48) says that the covariant tetrad-frame components of the rotational velocity vector $\alpha$ are the same as its covariant coordinate-frame components in the Doran-Cartesian coordinate system, $\alpha_m = \alpha_\mu$, and likewise the contravariant tetrad-frame components of the radial velocity vector $\beta$ are the same as its contravariant coordinate-frame components, $\beta^m = \beta^\mu$.
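The orthogonality (15.43) and the magnitude (15.46) can be checked at a sample point. The sketch below, with illustrative names and sample values, starts from Doran polar coordinates, converts to Doran-Cartesian coordinates with (15.39), and evaluates $\alpha_\mu$ and $\beta^\mu$ from (15.41), (15.42), and (15.44) for a black hole (infalling sign). The magnitude is computed in the tetrad frame, where the metric is Minkowski and $\beta^0 = 0$.

```python
import math

def doran_vectors(r, theta, phi_ff, M, Q, a):
    """Return (alpha_mu, beta^mu, beta, R, rho) at a Doran polar position."""
    R = math.sqrt(r * r + a * a)
    rho = math.sqrt(r * r + (a * math.cos(theta)) ** 2)
    beta = -math.sqrt(2.0 * M * r - Q * Q) / R      # equation (15.44), infalling
    x = R * math.sin(theta) * math.cos(phi_ff)      # equation (15.39)
    y = R * math.sin(theta) * math.sin(phi_ff)
    z = r * math.cos(theta)
    alpha = [1.0, a * y / R ** 2, -a * x / R ** 2, 0.0]          # equation (15.41)
    pref = beta * R / rho
    beta_vec = [0.0, pref * x * r / (R * rho),                   # equation (15.42)
                pref * y * r / (R * rho), pref * z * R / (r * rho)]
    return alpha, beta_vec, beta, R, rho

if __name__ == "__main__":
    alpha, beta_vec, beta, R, rho = doran_vectors(2.5, 0.9, 0.4, 1.0, 0.0, 0.96)
    dot = sum(al * b for al, b in zip(alpha, beta_vec))
    mag2 = sum(b * b for b in beta_vec)
    print(dot, mag2, (beta * R / rho) ** 2)
```

The magnitude identity uses $r^2 \sin^2\!\theta + R^2 \cos^2\!\theta = \rho^2$.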

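15.4.2 Doran-Cartesian tetrad

Like the Gullstrand-Painlevé tetrad, the Doran-Cartesian tetrad $\boldsymbol{\gamma}_m \equiv \{\boldsymbol{\gamma}_0, \boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2, \boldsymbol{\gamma}_3\}$ is aligned with the Cartesian rest frame $\boldsymbol{g}_\mu \equiv \{\boldsymbol{g}_{t_{\rm ff}}, \boldsymbol{g}_x, \boldsymbol{g}_y, \boldsymbol{g}_z\}$ at infinity, and is parallel-transported, without precessing, by observers who free-fall from zero velocity and zero angular momentum at infinity, as can be confirmed by checking that the tetrad connections with final index 0 all vanish, $\Gamma^n{}_{m0} = 0$, equation (15.14).

Let $\parallel$ and $\perp$ subscripts denote horizontal radial and azimuthal directions respectively, so that
$$ \boldsymbol{\gamma}_\parallel \equiv \cos\phi_{\rm ff} \, \boldsymbol{\gamma}_1 + \sin\phi_{\rm ff} \, \boldsymbol{\gamma}_2 , \quad \boldsymbol{\gamma}_\perp \equiv - \sin\phi_{\rm ff} \, \boldsymbol{\gamma}_1 + \cos\phi_{\rm ff} \, \boldsymbol{\gamma}_2 , $$
$$ \boldsymbol{g}_\parallel \equiv \cos\phi_{\rm ff} \, \boldsymbol{g}_x + \sin\phi_{\rm ff} \, \boldsymbol{g}_y , \quad \boldsymbol{g}_\perp \equiv - \sin\phi_{\rm ff} \, \boldsymbol{g}_x + \cos\phi_{\rm ff} \, \boldsymbol{g}_y . \tag{15.49} $$
Then the relation between Doran-Cartesian tetrad axes $\boldsymbol{\gamma}_m$ and the tangent axes $\boldsymbol{g}_\mu$ of the Doran-Cartesian metric (15.40) is
$$ \boldsymbol{\gamma}_0 = \boldsymbol{g}_{t_{\rm ff}} + \beta^i \boldsymbol{g}_i , \tag{15.50a} $$
$$ \boldsymbol{\gamma}_\parallel = \boldsymbol{g}_\parallel , \tag{15.50b} $$
$$ \boldsymbol{\gamma}_\perp = \boldsymbol{g}_\perp - \frac{a \sin\theta}{R} \, \beta^i \boldsymbol{g}_i , \tag{15.50c} $$
$$ \boldsymbol{\gamma}_3 = \boldsymbol{g}_z . \tag{15.50d} $$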

The relations (15.50) resemble those (15.19) of the Gullstrand-Painlevé tetrad, except that the azimuthal tetrad axis $\boldsymbol{\gamma}_\perp$ is shifted radially relative to the azimuthal tangent axis $\boldsymbol{g}_\perp$. This shift reflects the fact that, unlike the Gullstrand-Painlevé metric, the Doran metric is not spatially flat at constant free-fall time, but rather is sheared azimuthally.

15.4.3 Doran fishies

The tetrad-frame connections equal the ordinary coordinate partial derivatives in Doran-Cartesian coordinates of a bivector (antisymmetric tensor) $\omega_{km}$
$$ \Gamma_{kmn} = - \frac{\partial \omega_{km}}{\partial x^n} , \tag{15.51} $$
which I call the river field because it encapsulates all the properties of the infalling river of space. The bivector river field $\omega_{km}$ is
$$ \omega_{km} = \alpha_k \beta_m - \alpha_m \beta_k - \varepsilon_{0kmi} \, \zeta^i , \tag{15.52} $$
where $\beta_m = \eta_{mn} \beta^n$, the totally antisymmetric tensor $\varepsilon_{klmn}$ is normalized so that $\varepsilon_{0123} = -1$, and the vector $\zeta^i$ points vertically upward along the rotation axis of the black hole
$$ \zeta^i \equiv \{0, 0, 0, \zeta\} , \quad \zeta \equiv a \int_\infty^r \frac{\beta \, dr}{R^2} . \tag{15.53} $$
The electric part of $\omega_{km}$, where one of the indices is time 0, constitutes the velocity vector $\beta^i$
$$ \omega_{0i} = \beta_i , \tag{15.54} $$
while the magnetic part of $\omega_{km}$, where both indices are spatial, constitutes the twist vector $\mu^i$ defined by
$$ \mu^i \equiv \tfrac{1}{2} \, \varepsilon^{0ikm} \omega_{km} = \varepsilon^{0ikm} \alpha_k \beta_m + \zeta^i . \tag{15.55} $$

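The sense of the twist is that it induces a right-handed rotation about an axis equal to the direction of $\mu^i$ by an angle equal to the magnitude of $\mu^i$. In 3-vector notation, with $\boldsymbol{\mu} \equiv \mu^i$, $\boldsymbol{\alpha} \equiv \alpha_i$, $\boldsymbol{\beta} \equiv \beta^i$, $\boldsymbol{\zeta} \equiv \zeta^i$,
$$ \boldsymbol{\mu} \equiv \boldsymbol{\alpha} \times \boldsymbol{\beta} + \boldsymbol{\zeta} . \tag{15.56} $$
In terms of the velocity and twist vectors, the river field $\omega_{km}$ is
$$ \omega_{km} = \begin{pmatrix} 0 & \beta^x & \beta^y & \beta^z \\ -\beta^x & 0 & \mu^z & -\mu^y \\ -\beta^y & -\mu^z & 0 & \mu^x \\ -\beta^z & \mu^y & -\mu^x & 0 \end{pmatrix} . \tag{15.57} $$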

Note that the sign of the magnetic part $\boldsymbol{\mu}$ of $\omega_{km}$ is opposite to the sign of the analogous magnetic field $\boldsymbol{B}$ associated with an electromagnetic field $F_{km}$; but the adopted signs are natural in that the river field induces boosts in the direction of the velocity $\beta^i$, and right-handed rotations about the twist $\mu^i$. Like a static electric field, the velocity vector $\beta^i$ is the gradient of a potential
$$ \beta^i = \frac{\partial}{\partial x^i} \int^r \beta \, dr , \tag{15.58} $$
but unlike a magnetic field the twist vector $\mu^i$ is not pure curl: rather, it is $\mu^i + \zeta^i$ that is pure curl. Figure 15.3 illustrates the velocity and twist fields in a Kerr black hole.

Figure 15.3 (Upper panel) velocity $\beta^i$ and (lower panel) twist $\mu^i$ vector fields for a Kerr black hole with spin parameter a = 0.96. Both vectors lie, as shown, in the plane of constant free-fall azimuthal angle $\phi_{\rm ff}$. The vertical bar in the lower panel shows the length of a twist vector corresponding to a full rotation of 360°. [Figure labels: Rotation axis; Outer horizon; Inner horizon; 360°.]

With the tetrad connection coefficients given by equation (15.51), the equation of motion (15.21) for a 4-vector $p^k$ attached to a fishy following a geodesic in the Doran river translates to
$$ \frac{dp^k}{d\tau} = \frac{\partial \omega^k{}_m}{\partial x^n} \, u^n p^m . \tag{15.59} $$
In a proper time $\delta\tau$, the fishy moves a proper distance $\delta\xi^m \equiv u^m \delta\tau$ relative to the infalling river. As a result, the fishy sees a tidal change $\delta\omega^k{}_m$ in the river field
$$ \delta\omega^k{}_m = \delta\xi^n \frac{\partial \omega^k{}_m}{\partial x^n} . \tag{15.60} $$
Consequently the 4-vector $p^k$ is changed by
$$ p^k \to p^k + \delta\omega^k{}_m \, p^m . \tag{15.61} $$

But equation (15.61) corresponds to an infinitesimal Lorentz transformation by $\delta\omega^k{}_m$, equivalent to a Lorentz boost by $\delta\beta^i$ and a rotation by $\delta\mu^i$.

As discussed previously with regard to the Gullstrand-Painlevé river, §15.2.3, the tidal change $\delta\omega^k{}_m$, equation (15.60), in the river field seen by a fishy is not the full change $\delta x^\nu \, \partial\omega^k{}_m / \partial x^\nu$ relative to the background coordinates, but rather the change relative to the river
$$ \delta\omega^k{}_m = (\delta x^\nu - \delta x^\nu_{\rm river}) \frac{\partial \omega^k{}_m}{\partial x^\nu} = \left[ \delta x^\nu - \beta^\nu \left( \delta t_{\rm ff} - a \sin^2\!\theta \, \delta\phi_{\rm ff} \right) \right] \frac{\partial \omega^k{}_m}{\partial x^\nu} , \tag{15.62} $$
with the change in the velocity and twist of the river itself subtracted off.

That there exists a tetrad (the Doran-Cartesian tetrad) where the tetrad-frame connections are a coordinate gradient of a bivector, equation (15.51), is a peculiar feature of ideal black holes. It is an intriguing thought that perhaps the 6 physical degrees of freedom of a general spacetime might always be encoded in the 6 degrees of freedom of a bivector, but I suspect that that is not true.

Exercise 15.4 River model of the Friedmann-Robertson-Walker metric. Show that the flat FRW line-element
$$ ds^2 = - dt^2 + a^2 (dx^2 + x^2 do^2) \tag{15.63} $$
can be re-expressed as
$$ ds^2 = - dt^2 + (dr - H r \, dt)^2 + r^2 do^2 , \tag{15.64} $$
where $r \equiv a x$ is the proper radial distance, and $H \equiv \dot{a}/a$ is the Hubble parameter. Interpret the line-element (15.64). What is the generalization to a non-flat FRW universe?

Exercise 15.5 Program geodesics in a rotating black hole. Write a graphics (Java?) program that uses the prescription (15.60) to draw geodesics of test particles in an ideal (Kerr-Newman) black hole, expressed in Doran-Cartesian coordinates. Attach 3D bodies to your test particles, and use the same prescription (15.60) to rotate the bodies. Implement an option to translate to Boyer-Lindquist coordinates.

16 General spherically symmetric spacetime

16.1 Spherical spacetime

The most important equations in this chapter are the two Einstein equations (16.52).

Spherical spacetimes have 2 physical degrees of freedom. Spherical symmetry eliminates any angular degrees of freedom, leaving 4 adjustable metric coefficients $g_{tt}$, $g_{tr}$, $g_{rr}$, and $g_{\theta\theta}$. But coordinate transformations of the time $t$ and radial $r$ coordinates remove 2 degrees of freedom, leaving a spherical spacetime with a net 2 physical degrees of freedom.

Spherical spacetimes have 4 distinct Einstein equations (16.30). But 2 of the Einstein equations serve to enforce energy-momentum conservation, so the evolution of the spacetime is governed by 2 Einstein equations, in agreement with the number of physical degrees of freedom of spherical spacetime.

The 2 degrees of freedom mean that spherical spacetimes in general relativity have a richer structure than in Newtonian gravity, which has only one degree of freedom, the Newtonian potential $\Phi$. The richer structure is most striking in the case of the mass inflation instability, Chapter 17, which is an intrinsically general relativistic instability, with no Newtonian analogue.

16.1.1 Spherical line-element

The spherical line-element adopted in this chapter is, in spherical polar coordinates $x^\mu \equiv \{t, r, \theta, \phi\}$,
$$ ds^2 = - \frac{dt^2}{\alpha^2} + \frac{1}{\beta_r^2} \left( dr - \beta_t \frac{dt}{\alpha} \right)^2 + r^2 do^2 . \tag{16.1} $$

Here r is the circumferential radius, defined such that the circumference around any great circle is 2πr. The line-element (16.1) is somewhat unconventional in that it is not diagonal: gtr does not vanish. There are two good reasons to consider a non-diagonal metric. Firstly, as discussed in §16.1.11, Einstein’s equations take a more insightful form when expressed in a non-diagonal frame where βt does not vanish. Secondly, if a horizon is present, as in the case of black holes, and if the radial coordinate is taken to be the circumferential radius r, then a diagonal metric will have a coordinate singularity at the horizon, which is not ideal.


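The vierbein $e_m{}^\mu$ and inverse vierbein $e^m{}_\mu$ corresponding to the spherical line-element (16.1) are
$$ e_m{}^\mu = \begin{pmatrix} \alpha & \beta_t & 0 & 0 \\ 0 & \beta_r & 0 & 0 \\ 0 & 0 & 1/r & 0 \\ 0 & 0 & 0 & 1/(r \sin\theta) \end{pmatrix} , \qquad e^m{}_\mu = \begin{pmatrix} 1/\alpha & 0 & 0 & 0 \\ - \beta_t/(\alpha \beta_r) & 1/\beta_r & 0 & 0 \\ 0 & 0 & r & 0 \\ 0 & 0 & 0 & r \sin\theta \end{pmatrix} . \tag{16.2} $$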

As in the ADM formalism, Chapter 13, the tetrad time axis $\boldsymbol{\gamma}_t$ is chosen to be orthogonal to hypersurfaces of constant time $t$. However, the convention here for the vierbein coefficients differs from the ADM convention: here $1/\alpha$ is the ADM lapse, while $\beta_t/\alpha$ is the ADM shift.

The directed derivatives $\partial_t$ and $\partial_r$ along the time and radial tetrad axes $\boldsymbol{\gamma}_t$ and $\boldsymbol{\gamma}_r$ are
$$ \partial_t = e_t{}^\mu \frac{\partial}{\partial x^\mu} = \alpha \frac{\partial}{\partial t} + \beta_t \frac{\partial}{\partial r} , \quad \partial_r = e_r{}^\mu \frac{\partial}{\partial x^\mu} = \beta_r \frac{\partial}{\partial r} . \tag{16.3} $$
The tetrad-frame 4-velocity $u^m$ of a person at rest in the tetrad frame is by definition $u^m = \{1, 0, 0, 0\}$. It follows that the coordinate 4-velocity $u^\mu$ of such a person is
$$ u^\mu = e_m{}^\mu u^m = e_t{}^\mu = \{\alpha, \beta_t, 0, 0\} . \tag{16.4} $$
A person instantaneously at rest in the tetrad frame satisfies $dr/dt = \beta_t/\alpha$ according to equation (16.4), so it follows from the line-element (16.1) that the proper time $\tau$ of a person at rest in the tetrad frame is related to the coordinate time $t$ by
$$ d\tau = \frac{dt}{\alpha} \quad \text{in tetrad rest frame} . \tag{16.5} $$
The directed time derivative $\partial_t$ is just the proper time derivative along the worldline of a person continuously at rest in the tetrad frame (and who is therefore not in free-fall, but accelerating with the tetrad frame), which follows from
$$ \frac{d}{d\tau} = \frac{dx^\mu}{d\tau} \frac{\partial}{\partial x^\mu} = u^\mu \frac{\partial}{\partial x^\mu} = u^m \partial_m = \partial_t . \tag{16.6} $$
By contrast, the proper time derivative measured by a person who is instantaneously at rest in the tetrad frame, but is in free-fall, is the covariant time derivative
$$ \frac{D}{D\tau} = \frac{dx^\mu}{d\tau} D_\mu = u^\mu D_\mu = u^m D_m = D_t . \tag{16.7} $$
Since the coordinate radius $r$ has been defined to be the circumferential radius, a gauge-invariant definition, it follows that the tetrad-frame gradient $\partial_m$ of the coordinate radius $r$ is a tetrad-frame 4-vector (a coordinate gauge-invariant object)
$$ \partial_m r = e_m{}^\mu \frac{\partial r}{\partial x^\mu} = e_m{}^r = \beta_m = \{\beta_t, \beta_r, 0, 0\} \quad \text{is a tetrad 4-vector} . \tag{16.8} $$

This accounts for the notation $\beta_t$ and $\beta_r$ introduced above. Since $\beta_m$ is a tetrad 4-vector, its scalar product with itself must be a scalar. This scalar defines the interior mass $M(t, r)$, also called the Misner-Sharp mass, by
$$ 1 - \frac{2M}{r} \equiv \beta_m \beta^m = - \beta_t^2 + \beta_r^2 \quad \text{is a coordinate and tetrad scalar} . \tag{16.9} $$

The interpretation of M as the interior mass will become evident below, §16.1.8.
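To make the definition concrete, the relation $1 - 2M/r = -\beta_t^2 + \beta_r^2$ can be solved for the interior mass, $M = \tfrac{1}{2} r (1 + \beta_t^2 - \beta_r^2)$. The short Python sketch below, with an illustrative function name, evaluates this in two frames of the same Schwarzschild geometry: the free-fall (Gullstrand-Painlevé) frame, which corresponds to $\alpha = 1$, $\beta_r = 1$, $\beta_t = \beta$ on comparing (16.1) with (15.4), and a frame at rest, $\beta_t = 0$, outside the horizon. Both should return the same mass, as expected of a scalar.

```python
import math

def misner_sharp_mass(r, beta_t, beta_r):
    """Interior mass from 1 - 2M/r = -beta_t^2 + beta_r^2, solved for M."""
    return 0.5 * r * (1.0 + beta_t ** 2 - beta_r ** 2)

if __name__ == "__main__":
    M, r = 1.0, 5.0  # sample values in geometric units (assumed)
    # Free-fall frame: beta_t = -sqrt(2M/r), beta_r = 1.
    print(misner_sharp_mass(r, -math.sqrt(2.0 * M / r), 1.0))
    # Rest frame outside the horizon: beta_t = 0, beta_r = sqrt(1 - 2M/r).
    print(misner_sharp_mass(r, 0.0, math.sqrt(1.0 - 2.0 * M / r)))
```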

16.1.2 Rest diagonal line-element

Although this is not the choice adopted here, the line-element (16.1) can always be brought to diagonal form by a coordinate transformation $t \to t_\times$ (subscripted × for diagonal) of the time coordinate. The $t$–$r$ part of the metric is
$$ g_{tt} \, dt^2 + 2 \, g_{tr} \, dt \, dr + g_{rr} \, dr^2 = \frac{1}{g_{tt}} \left[ (g_{tt} \, dt + g_{tr} \, dr)^2 + (g_{tt} g_{rr} - g_{tr}^2) \, dr^2 \right] . \tag{16.10} $$
This can be diagonalized by choosing the time coordinate $t_\times$ such that
$$ f \, dt_\times = g_{tt} \, dt + g_{tr} \, dr \tag{16.11} $$
for some integrating factor $f(t, r)$. Equation (16.11) can be solved by choosing $t_\times$ to be constant along integral curves
$$ \frac{dr}{dt} = - \frac{g_{tt}}{g_{tr}} . \tag{16.12} $$
The resulting diagonal rest line-element is
$$ ds^2 = - \frac{dt_\times^2}{\alpha_\times^2} + \frac{dr^2}{1 - 2M/r} + r^2 do^2 . \tag{16.13} $$

The line-element (16.13) corresponds physically to the case where the tetrad frame is taken to be at rest in the spatial coordinates, $\beta_t = 0$, as can be seen by comparing it to the earlier line-element (16.1). In changing the tetrad frame from one moving at $dr/dt = \beta_t/\alpha$ to one that is at rest (at constant circumferential radius $r$), a tetrad transformation has in effect been done at the same time as the coordinate transformation (16.11), the tetrad transformation being precisely that needed to make the line-element (16.13) diagonal. The metric coefficient $g_{rr}$ in the line-element (16.13) follows from the fact that $\beta_r^2 = 1 - 2M/r$ when $\beta_t = 0$, equation (16.9).

The transformed time coordinate $t_\times$ is unspecified up to a transformation $t_\times \to f(t_\times)$. If the spacetime is asymptotically flat at infinity, then a natural way to fix the transformation is to choose $t_\times$ to be the proper time at rest at infinity.


16.1.3 Comoving diagonal line-element

Although once again this is not the path followed here, the line-element (16.1) can also be brought to diagonal form by a coordinate transformation $r \to r_\times$, where, analogously to equation (16.11), $r_\times$ is chosen to satisfy

\[ f\, dr_\times = g_{tr}\, dt + g_{rr}\, dr \equiv \frac{1}{\beta_r^2} \left( dr - \beta_t \frac{dt}{\alpha} \right) \tag{16.14} \]

for some integrating factor $f(t, r)$. The new coordinate $r_\times$ is constant along the worldline of an object at rest in the tetrad frame, with $dr/dt = \beta_t/\alpha$, equation (16.4), so $r_\times$ can be regarded as a comoving radial coordinate. The comoving radial coordinate $r_\times$ could for example be chosen to equal the circumferential radius $r$ at some fixed instant of coordinate time $t$ (say $t = 0$). The diagonal comoving line-element in this comoving coordinate system takes the form

\[ ds^2 = - \frac{dt^2}{\alpha^2} + \frac{dr_\times^2}{\lambda^2} + r^2 do^2 , \tag{16.15} \]

where the circumferential radius $r(t, r_\times)$ is considered to be an implicit function of time $t$ and the comoving radial coordinate $r_\times$. Whereas in the rest line-element (16.13) the tetrad was changed from one that was moving at $dr/dt = \beta_t/\alpha$ to one that was at rest, here the transformation keeps the tetrad unchanged. In both the rest and comoving diagonal line-elements (16.13) and (16.15) the tetrad is at rest relative to the respective radial coordinate $r$ or $r_\times$; but whereas in the rest line-element (16.13) the radial coordinate was fixed to be the circumferential radius $r$, in the comoving line-element (16.15) the comoving radial coordinate $r_\times$ is a label that follows the tetrad. Because the tetrad is unchanged by the transformation to the comoving radial coordinate $r_\times$, the directed time and radial derivatives are unchanged:

\[ \partial_t = \alpha \left. \frac{\partial}{\partial t} \right|_{r_\times} = \alpha \left. \frac{\partial}{\partial t} \right|_{r} + \beta_t \left. \frac{\partial}{\partial r} \right|_{t} , \qquad \partial_r = \lambda \left. \frac{\partial}{\partial r_\times} \right|_{t} = \beta_r \left. \frac{\partial}{\partial r} \right|_{t} . \tag{16.16} \]

16.1.4 Tetrad connections

Now turn the handle to proceed towards the Einstein equations. The tetrad connection coefficients $\Gamma_{kmn}$ corresponding to the spherical line-element (16.1) are

\begin{align}
\Gamma_{rtt} &= h_t , \tag{16.17a} \\
\Gamma_{rtr} &= h_r , \tag{16.17b} \\
\Gamma_{\theta t \theta} = \Gamma_{\phi t \phi} &= \frac{\beta_t}{r} , \tag{16.17c} \\
\Gamma_{\theta r \theta} = \Gamma_{\phi r \phi} &= \frac{\beta_r}{r} , \tag{16.17d} \\
\Gamma_{\phi \theta \phi} &= \frac{\cot\theta}{r} , \tag{16.17e}
\end{align}

where $h_t$ is the proper radial acceleration (minus the gravitational force) experienced by a person at rest in the tetrad frame

\[ h_t \equiv - \partial_r \ln \alpha , \tag{16.18} \]

and $h_r$ is the “Hubble parameter” of the radial flow, as measured in the tetrad rest frame, defined by

\[ h_r \equiv - \beta_t \frac{\partial \ln \alpha}{\partial r} + \frac{\partial \beta_t}{\partial r} - \partial_t \ln \beta_r . \tag{16.19} \]

The interpretation of $h_t$ as a proper acceleration and $h_r$ as a radial Hubble parameter goes as follows. The tetrad-frame 4-velocity $u^m$ of a person at rest in the tetrad frame is by definition $u^m = \{1, 0, 0, 0\}$. If the person at rest were in free fall, then the proper acceleration would be zero, but because this is a general spherical spacetime, the tetrad frame is not necessarily in free fall. The proper acceleration experienced by a person continuously at rest in the tetrad frame is the proper time derivative $Du^m/D\tau$ of the 4-velocity, which is

\[ \frac{Du^m}{D\tau} = D_t u^m = \partial_t u^m + \Gamma^m{}_{tt}\, u^t = \Gamma^m{}_{tt} = \{0, \Gamma^r{}_{tt}, 0, 0\} = \{0, h_t, 0, 0\} , \tag{16.20} \]

the first step of which follows from equation (16.7). Similarly, a person at rest in the tetrad frame will measure the 4-velocity of an adjacent person at rest in the tetrad frame a small proper radial distance $\delta\xi^r$ away to differ by $\delta\xi^r D_r u^m$. The Hubble parameter of the radial flow is thus the covariant radial derivative $D_r u^m$, which is

\[ D_r u^m = \partial_r u^m + \Gamma^m{}_{tr}\, u^t = \Gamma^m{}_{tr} = \{0, \Gamma^r{}_{tr}, 0, 0\} = \{0, h_r, 0, 0\} . \tag{16.21} \]

Confined to the $t$–$r$-plane (that is, considering only Lorentz transformations in the $t$–$r$-plane, which is to say radial Lorentz boosts), the acceleration $h_t$ and Hubble parameter $h_r$ constitute the components of a tetrad-frame 2-vector $h_n = \{h_t, h_r\}$:

\[ h_n = \Gamma_{rtn} . \tag{16.22} \]

The Riemann tensor, equations (16.24) below, involves covariant derivatives $D_m h_n$ of $h_n$, which coincide with the covariant derivatives $D^{(2)}_m h_n$ confined to the $t$–$r$-plane. Since $h_r$ is a kind of radial Hubble parameter, it can be useful to define a corresponding radial scale factor $\lambda$ by

\[ h_r \equiv - \partial_t \ln \lambda . \tag{16.23} \]

The scale factor λ is the same as the λ in the comoving line-element of equation (16.15). This is true because hr is a tetrad connection and therefore coordinate gauge-invariant, and the line-element (16.15) is related to the line-element (16.1) being considered by a coordinate transformation r → r× that leaves the tetrad unchanged.


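16.1.5 Riemann, Einstein, and Weyl tensors

The non-vanishing components of the tetrad-frame Riemann tensor $R_{klmn}$ corresponding to the spherical line-element (16.1) are

\begin{align}
R_{trtr} &= D_r h_t - D_t h_r , \tag{16.24a} \\
R_{t\theta t\theta} = R_{t\phi t\phi} &= - \frac{1}{r} D_t \beta_t , \tag{16.24b} \\
R_{r\theta r\theta} = R_{r\phi r\phi} &= - \frac{1}{r} D_r \beta_r , \tag{16.24c} \\
R_{t\theta r\theta} = R_{t\phi r\phi} &= - \frac{1}{r} D_t \beta_r = - \frac{1}{r} D_r \beta_t , \tag{16.24d} \\
R_{\theta\phi\theta\phi} &= \frac{2M}{r^3} , \tag{16.24e}
\end{align}

where $D_m$ denotes the covariant derivative as usual. The non-vanishing components of the tetrad-frame Einstein tensor $G^{km}$ are

\begin{align}
G^{tt} &= 2 R_{r\theta r\theta} + R_{\theta\phi\theta\phi} , \tag{16.25a} \\
G^{rr} &= 2 R_{t\theta t\theta} - R_{\theta\phi\theta\phi} , \tag{16.25b} \\
G^{tr} &= - 2 R_{t\theta r\theta} , \tag{16.25c} \\
G^{\theta\theta} = G^{\phi\phi} &= R_{trtr} + R_{t\theta t\theta} - R_{r\theta r\theta} , \tag{16.25d}
\end{align}

whence

\begin{align}
G^{tt} &= \frac{2}{r} \left( - D_r \beta_r + \frac{M}{r^2} \right) , \tag{16.26a} \\
G^{rr} &= \frac{2}{r} \left( - D_t \beta_t - \frac{M}{r^2} \right) , \tag{16.26b} \\
G^{tr} &= \frac{2}{r} D_t \beta_r = \frac{2}{r} D_r \beta_t , \tag{16.26c} \\
G^{\theta\theta} = G^{\phi\phi} &= D_r h_t - D_t h_r + \frac{1}{r} \left( D_r \beta_r - D_t \beta_t \right) . \tag{16.26d}
\end{align}

The non-vanishing components of the tetrad-frame Weyl tensor $C_{klmn}$ are

\[ \tfrac{1}{2} C_{trtr} = - C_{t\theta t\theta} = - C_{t\phi t\phi} = C_{r\theta r\theta} = C_{r\phi r\phi} = - \tfrac{1}{2} C_{\theta\phi\theta\phi} = C , \tag{16.27} \]

where $C$ is the Weyl scalar (the spin-0 component of the Weyl tensor),

\[ C \equiv \frac{1}{6} \left( R_{trtr} - R_{t\theta t\theta} + R_{r\theta r\theta} - R_{\theta\phi\theta\phi} \right) = \frac{1}{6} \left( G^{tt} - G^{rr} + G^{\theta\theta} \right) - \frac{M}{r^3} . \tag{16.28} \]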
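The algebra connecting equations (16.24e), (16.25), and (16.28) can be checked numerically. The sketch below is only a consistency check, not a derivation: the function names are hypothetical, `q` and `p` stand in for $\theta$ and $\phi$, and the vacuum Schwarzschild values of the Riemann components are the ones implied by setting the Einstein tensor (16.25) to zero.

```python
import random

def einstein_from_riemann(R_trtr, R_tqtq, R_rqrq, R_tqrq, R_qpqp):
    """Tetrad-frame Einstein components from equations (16.25)."""
    G_tt = 2.0 * R_rqrq + R_qpqp
    G_rr = 2.0 * R_tqtq - R_qpqp
    G_tr = -2.0 * R_tqrq
    G_qq = R_trtr + R_tqtq - R_rqrq
    return G_tt, G_rr, G_tr, G_qq

def weyl_from_riemann(R_trtr, R_tqtq, R_rqrq, R_qpqp):
    """First expression for the Weyl scalar C in equation (16.28)."""
    return (R_trtr - R_tqtq + R_rqrq - R_qpqp) / 6.0

def weyl_from_einstein(G_tt, G_rr, G_qq, M, r):
    """Second expression for the Weyl scalar C in equation (16.28)."""
    return (G_tt - G_rr + G_qq) / 6.0 - M / r**3

# Vacuum example: the Riemann components that make all of (16.25) vanish.
M, r = 1.0, 3.0
k = M / r**3
G_vac = einstein_from_riemann(-2.0 * k, k, -k, 0.0, 2.0 * k)
C_vac = weyl_from_riemann(-2.0 * k, k, -k, 2.0 * k)
print(G_vac, C_vac)  # Einstein tensor vanishes; C equals -M/r^3

# Random non-vacuum check that the two expressions for C agree,
# with R_qpqp tied to the interior mass by equation (16.24e).
random.seed(0)
max_diff = 0.0
for _ in range(100):
    a, b, c, d = (random.uniform(-1.0, 1.0) for _ in range(4))
    G_tt, G_rr, G_tr, G_qq = einstein_from_riemann(a, b, c, d, 2.0 * k)
    diff = weyl_from_riemann(a, b, c, 2.0 * k) - weyl_from_einstein(G_tt, G_rr, G_qq, M, r)
    max_diff = max(max_diff, abs(diff))
print(max_diff)
```

The agreement relies on $R_{\theta\phi\theta\phi} = 2M/r^3$; with any other value the two expressions in (16.28) would differ.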

16.1.6 Einstein equations

The tetrad-frame Einstein equations

\[ G^{km} = 8\pi T^{km} \tag{16.29} \]

imply that

\[
\begin{pmatrix}
G^{tt} & G^{tr} & 0 & 0 \\
G^{tr} & G^{rr} & 0 & 0 \\
0 & 0 & G^{\theta\theta} & 0 \\
0 & 0 & 0 & G^{\phi\phi}
\end{pmatrix}
= 8\pi T^{km} = 8\pi
\begin{pmatrix}
\rho & f & 0 & 0 \\
f & p & 0 & 0 \\
0 & 0 & p_\perp & 0 \\
0 & 0 & 0 & p_\perp
\end{pmatrix}
\tag{16.30}
\]

where $\rho \equiv T^{tt}$ is the proper energy density, $f \equiv T^{tr}$ is the proper radial energy flux, $p \equiv T^{rr}$ is the proper radial pressure, and $p_\perp \equiv T^{\theta\theta} = T^{\phi\phi}$ is the proper transverse pressure.

16.1.7 Choose your frame

So far the radial motion of the tetrad frame has been left unspecified. Any arbitrary choice can be made. For example, the tetrad frame could be chosen to be at rest,

\[ \beta_t = 0 , \tag{16.31} \]

as in the Schwarzschild or Reissner-Nordström line-elements. Alternatively, the tetrad frame could be chosen to be in free-fall,

\[ h_t = 0 , \tag{16.32} \]

as in the Gullstrand-Painlevé line-element. For situations where the spacetime contains matter, one natural choice is the center-of-mass frame, defined to be the frame in which the energy flux $f$ is zero

\[ G^{tr} = 8\pi f = 0 . \tag{16.33} \]

Whatever the choice of radial tetrad frame, tetrad-frame quantities in different radial tetrad frames are related to each other by a radial Lorentz boost.

16.1.8 Interior mass

Equations (16.26b) with the middle expression of (16.26c), and (16.26a) with the final expression of (16.26c), respectively, along with the definition (16.9) of the interior mass $M$, and the Einstein equations (16.30), imply

\begin{align}
p &= \frac{1}{\beta_t} \left( - \frac{1}{4\pi r^2} \partial_t M - \beta_r f \right) , \tag{16.34a} \\
\rho &= \frac{1}{\beta_r} \left( \frac{1}{4\pi r^2} \partial_r M - \beta_t f \right) . \tag{16.34b}
\end{align}

In the center-of-mass frame, $f = 0$, these equations reduce to

\begin{align}
\partial_t M &= - 4\pi r^2 \beta_t\, p , \tag{16.35a} \\
\partial_r M &= 4\pi r^2 \beta_r\, \rho . \tag{16.35b}
\end{align}

Equations (16.35) amply justify the interpretation of $M$ as the interior mass. The first equation (16.35a) can be written

\[ \partial_t M + p\, 4\pi r^2 \partial_t r = 0 , \tag{16.36} \]

which can be recognized as an expression of the first law of thermodynamics,

\[ dE + p\, dV = 0 , \tag{16.37} \]

with mass-energy $E$ equal to $M$. The second equation (16.35b) can be written, since $\partial_r = \beta_r\, \partial/\partial r$, equation (16.3),

\[ \frac{\partial M}{\partial r} = 4\pi r^2 \rho , \tag{16.38} \]

which looks exactly like the Newtonian relation between interior mass $M$ and density $\rho$. Actually, this apparently Newtonian equation (16.38) is deceptive. The proper 3-volume element $d^3r$ in the center-of-mass tetrad frame is given by

\[ d^3r\, \boldsymbol{\gamma}_r \wedge \boldsymbol{\gamma}_\theta \wedge \boldsymbol{\gamma}_\phi = \boldsymbol{g}_r\, dr \wedge \boldsymbol{g}_\theta\, d\theta \wedge \boldsymbol{g}_\phi\, d\phi = \frac{r^2 \sin\theta\, dr\, d\theta\, d\phi}{\beta_r}\, \boldsymbol{\gamma}_r \wedge \boldsymbol{\gamma}_\theta \wedge \boldsymbol{\gamma}_\phi , \tag{16.39} \]

so that the proper 3-volume element $dV \equiv d^3r$ of a radial shell of width $dr$ is

\[ dV = \frac{4\pi r^2 dr}{\beta_r} . \tag{16.40} \]

Thus the “true” mass-energy $dM_m$ associated with the proper density $\rho$ in a proper radial volume element $dV$ might be expected to be

\[ dM_m = \rho\, dV = \frac{\rho\, 4\pi r^2 dr}{\beta_r} , \tag{16.41} \]

whereas equation (16.38) indicates that the actual mass-energy is

\[ dM = \rho\, 4\pi r^2 dr = \beta_r\, \rho\, dV . \tag{16.42} \]

A person in the center-of-mass frame might perhaps, although there is really no formal justification for doing so, interpret the balance of the mass-energy as gravitational mass-energy $M_g$

\[ dM_g = (\beta_r - 1)\, \rho\, dV . \tag{16.43} \]

Whatever the case, the moral of this is that you should beware of interpreting the interior mass M too literally as palpable mass-energy.
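To make the distinction between equations (16.41) and (16.42) concrete, the sketch below compares the interior mass $M$ with the summed proper mass $M_m$ for a static star of uniform density. The uniform density, the choice $2M/R = 0.4$, the midpoint-rule integration, and the function name are all assumptions made for illustration.

```python
import math

def masses_uniform_star(rho, radius, steps=20000):
    """Return (interior mass M, proper mass M_m) for a static uniform-density star.

    M sums dM = rho 4 pi r^2 dr (equation 16.38); M_m sums dM_m = rho dV with
    dV = 4 pi r^2 dr / beta_r (equation 16.41), where beta_r**2 = 1 - 2M/r.
    """
    dr = radius / steps
    interior = 0.0
    proper = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr                        # midpoint of the shell
        shell = 4.0 * math.pi * r**2 * rho * dr   # dM of this shell
        m_mid = interior + 0.5 * shell            # interior mass at the midpoint
        beta_r = math.sqrt(1.0 - 2.0 * m_mid / r)
        interior += shell
        proper += shell / beta_r
    return interior, proper

# Units G = c = 1; density chosen so that M = 0.2 at radius R = 1 (assumed values).
rho = 0.2 / (4.0 / 3.0 * math.pi)
interior, proper = masses_uniform_star(rho, 1.0)
print(interior, proper, interior - proper)   # the difference is the "M_g" of (16.43)
```

Because $\beta_r < 1$ inside the star, the proper mass exceeds the interior mass, and the balance $M_g$ is negative, in the manner of a binding energy.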

16.1.9 Energy-momentum conservation

Covariant conservation of the Einstein tensor $D_m G^{mn} = 0$ implies conservation of energy-momentum $D_m T^{mn} = 0$. The two non-vanishing equations represent conservation of energy and of radial momentum, and are

\begin{align}
D_m T^{mt} &= \partial_t \rho + \frac{2\beta_t}{r} (\rho + p_\perp) + h_r (\rho + p) + \left( \partial_r + \frac{2\beta_r}{r} + 2 h_t \right) f = 0 , \tag{16.44a} \\
D_m T^{mr} &= \partial_r p + \frac{2\beta_r}{r} (p - p_\perp) + h_t (\rho + p) + \left( \partial_t + \frac{2\beta_t}{r} + 2 h_r \right) f = 0 . \tag{16.44b}
\end{align}

In the center-of-mass frame, $f = 0$, these energy-momentum conservation equations reduce to

\begin{align}
\partial_t \rho + \frac{2\beta_t}{r} (\rho + p_\perp) + h_r (\rho + p) &= 0 , \tag{16.45a} \\
\partial_r p + \frac{2\beta_r}{r} (p - p_\perp) + h_t (\rho + p) &= 0 . \tag{16.45b}
\end{align}

In a general situation where the mass-energy is the sum over several individual components $a$,

\[ T^{mn} = \sum_{\text{species } a} T_a^{mn} , \tag{16.46} \]

the individual mass-energy components $a$ of the system each satisfy an energy-momentum conservation equation of the form

\[ D_m T_a^{mn} = F_a^n , \tag{16.47} \]

where $F_a^n$ is the flux of energy into component $a$. Einstein’s equations enforce energy-momentum conservation of the system as a whole, so the sum of the energy fluxes must be zero

\[ \sum_{\text{species } a} F_a^n = 0 . \tag{16.48} \]

16.1.10 First law of thermodynamics

For an individual species $a$, the energy conservation equation (16.44a) in the center-of-mass frame of the species, $f_a = 0$, can be written

\[ D_m T_a^{mt} = \partial_t \rho_a + (\rho_a + p_{\perp a})\, \partial_t \ln r^2 + (\rho_a + p_a)\, \partial_t \ln \lambda_a = F_a^t , \tag{16.49} \]

where $\lambda_a$ is the radial “scale factor,” equation (16.23), in the center-of-mass frame of the species (the scale factor is different in different frames). Equation (16.49) can be recognized as an expression of the first law of thermodynamics for a volume element $V$ of species $a$, in the form

\[ V^{-1} \left[ \partial_t (\rho_a V) + p_{\perp a} V_r\, \partial_t V_\perp + p_a V_\perp\, \partial_t V_r \right] = F_a^t , \tag{16.50} \]

with transverse volume (area) $V_\perp \propto r^2$, radial volume (width) $V_r \propto \lambda_a$, and total volume $V \propto V_\perp V_r$. The flux $F_a^t$ on the right hand side is the heat per unit volume per unit time going into species $a$. If the pressure of species $a$ is isotropic, $p_{\perp a} = p_a$, then equation (16.50) simplifies to

\[ V^{-1} \left[ \partial_t (\rho_a V) + p_a\, \partial_t V \right] = F_a^t , \tag{16.51} \]

with volume $V \propto r^2 \lambda_a$.
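The step from equation (16.50) back to equation (16.49) can be spelled out as follows. This is a short worked expansion using only the stated proportionalities $V = V_\perp V_r$, $V_\perp \propto r^2$, and $V_r \propto \lambda_a$; it adds no new assumptions.

```latex
\begin{align*}
V^{-1} \partial_t (\rho_a V)
  &= \partial_t \rho_a + \rho_a \left( \partial_t \ln V_\perp + \partial_t \ln V_r \right) , \\
V^{-1} p_{\perp a} V_r \, \partial_t V_\perp
  &= p_{\perp a} \, \partial_t \ln V_\perp , \\
V^{-1} p_a V_\perp \, \partial_t V_r
  &= p_a \, \partial_t \ln V_r , \\
\text{sum:} \quad
  & \partial_t \rho_a + (\rho_a + p_{\perp a}) \, \partial_t \ln V_\perp
    + (\rho_a + p_a) \, \partial_t \ln V_r .
\end{align*}
```

Substituting $\partial_t \ln V_\perp = \partial_t \ln r^2$ and $\partial_t \ln V_r = \partial_t \ln \lambda_a$ into the sum reproduces the left hand side of equation (16.49) term by term.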


16.1.11 Structure of the Einstein equations

The spherically symmetric spacetime under consideration is described by 3 vierbein coefficients, $\alpha$, $\beta_t$, and $\beta_r$. However, some combination of the 3 coefficients represents a gauge freedom, since the spherically symmetric spacetime has only two physical degrees of freedom. As commented in §16.1.7, various gauge-fixing choices can be made, such as choosing to work in the center-of-mass frame, $f = 0$.

Equations (16.26) give 4 equations for the 4 non-vanishing components of the Einstein tensor. The two expressions for $G^{tr}$ are identical when expressed in terms of the vierbein and vierbein derivatives, so are not distinct equations. Conservation of energy-momentum of the system as a whole is built in to the Einstein equations, a consequence of the Bianchi identities, so 2 of the Einstein equations are effectively equivalent to the energy-momentum conservation equations (16.44). In the general case where the matter contains multiple components, it is usually a good idea to include the equations describing the conservation or exchange of energy-momentum separately for each component, so that global conservation of energy-momentum is then satisfied as a consequence of the matter equations.

This leaves 2 independent Einstein equations to describe the 2 physical degrees of freedom of the spacetime. The 2 equations may be taken to be the evolution equations (16.26b) and (16.26c) for $\beta_t$ and $\beta_r$ respectively,

\begin{align}
D_t \beta_t &= - \frac{M}{r^2} - 4\pi r p , \tag{16.52a} \\
D_t \beta_r &= 4\pi r f , \tag{16.52b}
\end{align}

which are valid for any choice of tetrad frame, not just the center-of-mass frame. Equations (16.52) are the most important of the general relativistic equations governing spherically symmetric spacetimes. It is these equations that are responsible (to the extent that equations may be considered responsible) for the strange internal structure of Reissner-Nordström black holes, and for mass inflation.

The coefficient $\beta_t$ equals the coordinate radial 4-velocity $dr/d\tau = \partial_t r = \beta_t$ of the tetrad frame, equation (16.4), and thus equation (16.52a) can be regarded as giving the proper radial acceleration $D^2 r/D\tau^2 = D\beta_t/D\tau = D_t \beta_t$ of the tetrad frame as measured by a person who is in free-fall and instantaneously at rest in the tetrad frame. If the acceleration is measured by an observer who is continuously at rest in the tetrad frame (as opposed to being in free-fall), then the proper acceleration is $\partial_t \beta_t = D_t \beta_t + \beta_r h_t$. The presence of the extra term $\beta_r h_t$, proportional to the proper acceleration $h_t$ actually experienced by the observer continuously at rest in the tetrad frame, reflects the principle of equivalence of gravity and acceleration.

The right hand side of equation (16.52a) can be interpreted as the radial gravitational force, which consists of two terms. The first term, $-M/r^2$, looks like the familiar Newtonian gravitational force, which is attractive (negative, inward) in the usual case of positive mass $M$. The second term, $-4\pi r p$, proportional to the radial pressure $p$, is what makes spherical spacetimes in general relativity interesting. In a Reissner-Nordström black hole, the negative radial pressure produced by the radial electric field produces a radial gravitational repulsion (positive, outward), according to equation (16.52a), and this repulsion dominates the gravitational force at small radii, producing an inner horizon. In mass inflation, the (positive) radial pressure of relativistically counter-streaming ingoing and outgoing streams just above the inner horizon dominates the gravitational force (inward), and it is this that drives mass inflation.

Like the second half of a vaudeville act, the second Einstein equation (16.52b) also plays an indispensable role. The quantity $\beta_r \equiv \partial_r r$ on the left hand side is the proper radial gradient of the circumferential radius $r$ measured by a person at rest in the tetrad frame. The sign of $\beta_r$ determines which way an observer at rest in the tetrad frame thinks is “outwards”, the direction of larger circumferential radius $r$. A positive $\beta_r$ means that the observer thinks the outward direction points away from the black hole, while a negative $\beta_r$ means that the observer thinks the outward direction points towards the black hole. Outside the outer horizon $\beta_r$ is necessarily positive, because $\beta_m$ must be spacelike there. But inside the horizon $\beta_r$ may be either positive or negative. A tetrad frame can be defined as “ingoing” if the proper radial gradient $\beta_r$ is positive, and “outgoing” if $\beta_r$ is negative. In the Reissner-Nordström geometry, ingoing geodesics have positive energy, and outgoing geodesics have negative energy. However, the present definition of ingoing or outgoing based on the sign of $\beta_r$ is general: there is no need for a timelike Killing vector such as would be necessary to define the (conserved) energy of a geodesic.

Equation (16.52b) shows that the proper rate of change $D_t \beta_r$ in the radial gradient $\beta_r$ measured by an observer who is in free-fall and instantaneously at rest in the tetrad frame is proportional to the radial energy flux $f$ in that frame. But ingoing observers tend to see energy flux pointing away from the black hole, while outgoing observers tend to see energy flux pointing towards the black hole. Thus the change in $\beta_r$ tends to be in the same direction as $\beta_r$, amplifying $\beta_r$ whatever its sign.

Exercise 16.1 Birkhoff’s theorem. Prove Birkhoff’s theorem from equations (16.52).

16.1.12 Comment on the vierbein coefficient α

Whereas the Einstein equations (16.52) give evolution equations for the vierbein coefficients $\beta_t$ and $\beta_r$, there is no evolution equation for the vierbein coefficient $\alpha$. Indeed, the Einstein equations involve the vierbein coefficient $\alpha$ only in the combination $h_t \equiv -\partial_r \ln \alpha$. This reflects the fact that, even after the tetrad frame is fixed, there is still a coordinate freedom $t \to t'(t)$ in the choice of coordinate time $t$. Under such a gauge transformation, $\alpha$ transforms as $\alpha \to \alpha' = f(t)\, \alpha$ where $f(t) = \partial t'/\partial t$ is an arbitrary function of coordinate time $t$. Only $h_t \equiv -\partial_r \ln \alpha$ is independent of this coordinate gauge freedom, and thus only $h_t$, not $\alpha$ itself, appears in the tetrad-frame Einstein equations.

Since $\alpha$ is needed to propagate the equations from one coordinate time to the next (because $\partial_t = \alpha\, \partial/\partial t + \beta_t\, \partial/\partial r$), it is necessary to construct $\alpha$ by integrating $h_t \equiv -\beta_r\, \partial \ln \alpha/\partial r$ along the radial direction $r$ at each time step. The arbitrary normalization of $\alpha$ at each step might be fixed by choosing $\alpha$ to be unity at infinity, which corresponds to fixing the time coordinate $t$ to equal the proper time at infinity.

In the particular case that the tetrad frame is taken to be in free-fall everywhere, $h_t = 0$, as in the Gullstrand-Painlevé line-element, then $\alpha$ is constant at fixed $t$, and without loss of generality it can be fixed equal to unity everywhere, $\alpha = 1$. I like to think of a free-fall frame as being realized physically by tracer “dark matter” particles that free-fall radially (from zero velocity, typically) at infinity, and stream freely, without interacting, through any actual matter that may be present.

16.2 Spherical electromagnetic field

The internal structure of a charged black hole resembles that of a rotating black hole because the negative pressure (tension) of the radial electric field produces a gravitational repulsion analogous to the centrifugal repulsion in a rotating black hole. Since it is much easier to deal with spherical than rotating black holes, it is common to use charge as a surrogate for rotation in exploring black holes.

16.2.1 Electromagnetic field

The assumption of spherical symmetry means that any electromagnetic field can consist only of a radial electric field (in the absence of magnetic monopoles). The only non-vanishing components of the electromagnetic field $F_{mn}$ are then

\[ F_{tr} = - F_{rt} = E = \frac{Q}{r^2} , \tag{16.53} \]

where $E$ is the radial electric field, and $Q(t, r)$ is the interior electric charge. Equation (16.53) can be regarded as defining what is meant by the electric charge $Q$ interior to radius $r$ at time $t$.

16.2.2 Maxwell’s equations

A radial electric field automatically satisfies two of Maxwell’s equations, the source-free ones (11.64). For the radial electric field (16.53), the other two Maxwell’s equations, the sourced ones (11.65), are

\begin{align}
\partial_r Q &= 4\pi r^2 q , \tag{16.54a} \\
\partial_t Q &= - 4\pi r^2 j , \tag{16.54b}
\end{align}

where $q \equiv j^t$ is the proper electric charge density and $j \equiv j^r$ is the proper radial electric current density in the tetrad frame.

16.2.3 Electromagnetic energy-momentum tensor

For the radial electric field (16.53), the electromagnetic energy-momentum tensor (11.70) in the tetrad frame is the diagonal tensor

\[
T_e^{mn} = \frac{Q^2}{8\pi r^4}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix} .
\tag{16.55}
\]

The radial electric energy-momentum tensor is independent of the radial motion of the tetrad frame, which reflects the fact that the electric field is invariant under a radial Lorentz boost. The energy density $\rho_e$ and radial and transverse pressures $p_e$ and $p_{\perp e}$ of the electromagnetic field are the same as those from a spherical charge distribution with interior electric charge $Q$ in flat space

\[ \rho_e = - p_e = p_{\perp e} = \frac{Q^2}{8\pi r^4} = \frac{E^2}{8\pi} . \tag{16.56} \]

The non-vanishing components of the covariant derivative $D_m T_e^{mn}$ of the electromagnetic energy-momentum (16.55) are

\begin{align}
D_m T_e^{mt} &= \partial_t \rho_e + \frac{4\beta_t}{r} \rho_e = \frac{Q}{4\pi r^4}\, \partial_t Q = - \frac{jQ}{r^2} = - jE , \tag{16.57a} \\
D_m T_e^{mr} &= \partial_r p_e + \frac{4\beta_r}{r} p_e = - \frac{Q}{4\pi r^4}\, \partial_r Q = - \frac{qQ}{r^2} = - qE . \tag{16.57b}
\end{align}

The first expression (16.57a), which gives the rate of energy transfer out of the electromagnetic field as the current density j times the electric field E, is the same as in flat space. The second expression (16.57b), which gives the rate of transfer of radial momentum out of the electromagnetic field as the charge density q times the electric field E, is the Lorentz force on a charge density q, and again is the same as in flat space.
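The gravitational repulsion produced by the electric tension can be illustrated by inserting the electromagnetic pressure (16.56) into the right hand side of the evolution equation (16.52a). The sketch below assumes a Reissner-Nordström geometry with constant charge $Q$ and uses the standard interior mass $M(r) = M_\infty - Q^2/(2r)$, which follows from integrating equation (16.38) with $\rho = \rho_e$. The function names and sample values are hypothetical.

```python
import math

def interior_mass_rn(r, mass_inf, charge):
    """Interior mass M(r) = M_inf - Q^2/(2r) of Reissner-Nordstrom (assumed standard result)."""
    return mass_inf - charge**2 / (2.0 * r)

def radial_pressure_em(r, charge):
    """Radial electromagnetic pressure p_e = -Q^2/(8 pi r^4), equation (16.56)."""
    return -charge**2 / (8.0 * math.pi * r**4)

def gravitational_force(r, mass_inf, charge):
    """Right hand side of equation (16.52a): -M/r^2 - 4 pi r p."""
    M = interior_mass_rn(r, mass_inf, charge)
    p = radial_pressure_em(r, charge)
    return -M / r**2 - 4.0 * math.pi * r * p

# Sample values in units G = c = 1 (assumed): the force changes sign at r = Q^2 / M_inf.
mass_inf, charge = 1.0, 0.8
r_turn = charge**2 / mass_inf
for r in (2.0, r_turn, 0.3):
    print(r, gravitational_force(r, mass_inf, charge))
```

With these assumptions the force reduces to $-M_\infty/r^2 + Q^2/r^3$: attractive at large radius, repulsive inside $r = Q^2/M_\infty$.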

16.3 General relativistic stellar structure

A star can be well approximated as static as well as spherically symmetric. In this case all time derivatives can be taken to vanish, $\partial/\partial t = 0$, and, since the center-of-mass frame coincides with the rest frame, it is natural to choose the tetrad frame to be at rest, $\beta_t = 0$. The Einstein equation (16.52b) then vanishes identically, while the Einstein equation (16.52a) becomes

\[ \beta_r h_t = \frac{M}{r^2} + 4\pi r p , \tag{16.58} \]

which expresses the proper acceleration $h_t$ in the rest frame in terms of the familiar Newtonian gravitational force $M/r^2$ plus a term $4\pi r p$ proportional to the radial pressure. The radial pressure $p$, if positive as is the usual case for a star, enhances the inward gravitational force, helping to destabilize the star. Because $\beta_t$ is zero, the interior mass $M$ given by equation (16.9) reduces to

\[ 1 - 2M/r = \beta_r^2 . \tag{16.59} \]

When equations (16.58) and (16.59) are substituted into the momentum equation (16.44b), and if the pressure is taken to be isotropic, so $p_\perp = p$, the result is the Oppenheimer-Volkoff equation for general relativistic hydrostatic equilibrium

\[ \frac{\partial p}{\partial r} = - \frac{(\rho + p)(M + 4\pi r^3 p)}{r^2 (1 - 2M/r)} . \tag{16.60} \]


In the Newtonian limit $p \ll \rho$ and $M \ll r$ this goes over to (with units restored)

\[ \frac{\partial p}{\partial r} = - \rho\, \frac{GM}{r^2} , \tag{16.61} \]

which is the usual Newtonian equation of spherically symmetric hydrostatic equilibrium.
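Equation (16.60) together with the mass equation (16.38) can be integrated outward from the center of a star. The sketch below does this for a uniform-density star with a classical fourth-order Runge-Kutta step, and checks the result against the known analytic central pressure of the uniform-density (Schwarzschild interior) solution. The uniform density, the value $M/R = 0.2$, the step count, and the function names are assumptions made for illustration, and the analytic formula is quoted from the standard literature rather than derived in this text.

```python
import math

def tov_rhs(r, p, m, rho):
    """Right hand sides of dp/dr (equation 16.60) and dM/dr (equation 16.38)."""
    dp = -(rho + p) * (m + 4.0 * math.pi * r**3 * p) / (r**2 * (1.0 - 2.0 * m / r))
    dm = 4.0 * math.pi * r**2 * rho
    return dp, dm

def integrate_uniform_star(rho, p_center, radius, steps=2000):
    """Integrate p and M from near the center out to `radius` with classical RK4."""
    r = 1.0e-6 * radius                          # start slightly off-center to avoid 0/0
    h = (radius - r) / steps
    p, m = p_center, 4.0 / 3.0 * math.pi * rho * r**3
    for _ in range(steps):
        k1p, k1m = tov_rhs(r, p, m, rho)
        k2p, k2m = tov_rhs(r + 0.5 * h, p + 0.5 * h * k1p, m + 0.5 * h * k1m, rho)
        k3p, k3m = tov_rhs(r + 0.5 * h, p + 0.5 * h * k2p, m + 0.5 * h * k2m, rho)
        k4p, k4m = tov_rhs(r + h, p + h * k3p, m + h * k3m, rho)
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        m += h * (k1m + 2.0 * k2m + 2.0 * k3m + k4m) / 6.0
        r += h
    return p, m

# Units G = c = 1; illustrative star with M/R = 0.2 (assumed values).
radius, mass = 1.0, 0.2
rho = mass / (4.0 / 3.0 * math.pi * radius**3)
s = math.sqrt(1.0 - 2.0 * mass / radius)
p_center = rho * (1.0 - s) / (3.0 * s - 1.0)     # analytic central pressure, uniform density
p_surface, m_surface = integrate_uniform_star(rho, p_center, radius)
print(p_surface / p_center, m_surface)           # pressure should vanish at the surface
```

Starting from the analytic central pressure, the integrated pressure falls to zero at the stellar surface, which is the boundary condition that fixes the radius of the star.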

16.4 Self-similar spherically symmetric spacetime

Even with the assumption of spherical symmetry, it is by no means easy to solve the system of partial differential equations that comprise the Einstein equations coupled to mass-energy of various kinds. One way to simplify the system of equations, transforming them into ordinary differential equations, is to consider self-similar solutions.

16.4.1 Self-similarity

The assumption of self-similarity (also known as homothety, if you can pronounce it) is the assumption that the system possesses conformal time translation invariance. This implies that there exists a conformal time coordinate $\eta$ such that the geometry at any one time is conformally related to the geometry at any other time

\[ ds^2 = a(\eta)^2 \left[ g^{(c)}_{\eta\eta}(x)\, d\eta^2 + 2\, g^{(c)}_{\eta x}(x)\, d\eta\, dx + g^{(c)}_{xx}(x)\, dx^2 + e^{2x} do^2 \right] . \tag{16.62} \]

Here the conformal metric coefficients $g^{(c)}_{\mu\nu}(x)$ are functions only of conformal radius $x$, not of conformal time $\eta$. The choice $e^{2x}$ of coefficient of $do^2$ is a gauge choice of the conformal radius $x$, carefully chosen here so as to bring the self-similar line-element into a form (16.66) below that resembles as far as possible the spherical line-element (16.1). In place of the conformal factor $a(\eta)$ it is convenient to work with the circumferential radius $r$

\[ r \equiv a(\eta)\, e^x \tag{16.63} \]

which is to be considered as a function $r(\eta, x)$ of the coordinates $\eta$ and $x$. The circumferential radius $r$ has a gauge-invariant meaning, whereas neither $a(\eta)$ nor $x$ are independently gauge-invariant. The conformal factor $r$ has the dimensions of length. In self-similar solutions, all quantities are proportional to some power of $r$, and that power can be determined by dimensional analysis. Quantities that depend only on the conformal radial coordinate $x$, independent of the circumferential radius $r$, are called dimensionless.

The fact that dimensionless quantities such as the conformal metric coefficients $g^{(c)}_{\mu\nu}(x)$ are independent of conformal time $\eta$ implies that the tangent vector $\boldsymbol{g}_\eta$, which by definition satisfies

\[ \frac{\partial}{\partial \eta} = \boldsymbol{g}_\eta \cdot \boldsymbol{\partial} , \tag{16.64} \]

is a conformal Killing vector, also known as the homothetic vector. The tetrad-frame components of the conformal Killing vector $\boldsymbol{g}_\eta$ define the tetrad-frame conformal Killing 4-vector $\xi^m$

\[ \frac{\partial}{\partial \eta} \equiv r\, \xi^m \partial_m , \tag{16.65} \]

in which the factor $r$ is introduced so as to make $\xi^m$ dimensionless. The conformal Killing vector $\boldsymbol{g}_\eta$ is the generator of the conformal time translation symmetry, and as such it is gauge-invariant (up to a global rescaling of conformal time, $\eta \to b\eta$ for some constant $b$). It follows that its dimensionless tetrad-frame components $\xi^m$ constitute a tetrad 4-vector (again, up to global rescaling of conformal time).

16.4.2 Self-similar line-element

The self-similar line-element can be taken to have the same form as the spherical line-element (16.1), but with the dependence on the dimensionless conformal Killing vector $\xi^m$ made manifest:

\[ ds^2 = r^2 \left[ - (\xi^\eta d\eta)^2 + \frac{1}{\beta_x^2} \left( dx + \beta_x \xi^x d\eta \right)^2 + do^2 \right] . \tag{16.66} \]

The vierbein $e_m{}^\mu$ and inverse vierbein $e^m{}_\mu$ corresponding to the self-similar line-element (16.66) are

\[
e_m{}^\mu = \frac{1}{r}
\begin{pmatrix}
1/\xi^\eta & - \beta_x \xi^x/\xi^\eta & 0 & 0 \\
0 & \beta_x & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1/\sin\theta
\end{pmatrix} ,
\qquad
e^m{}_\mu = r
\begin{pmatrix}
\xi^\eta & 0 & 0 & 0 \\
\xi^x & 1/\beta_x & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & \sin\theta
\end{pmatrix} .
\tag{16.67}
\]

It is straightforward to see that the coordinate time components of the inverse vierbein must be $e^m{}_\eta = r\, \xi^m$, since $\partial/\partial\eta = e^m{}_\eta\, \partial_m$ equals $r\, \xi^m \partial_m$, equation (16.65).

16.4.3 Tetrad-frame scalars and vectors

Since the conformal factor $r$ is gauge-invariant, the directed gradient $\partial_m r$ constitutes a tetrad-frame 4-vector $\beta_m$ (which unlike $\xi^m$ is independent of any global rescaling of conformal time)

\[ \beta_m \equiv \partial_m r . \tag{16.68} \]

It is straightforward to check that $\beta_x$ defined by equation (16.68) is consistent with its appearance in the vierbein (16.67) provided that $r \propto e^x$ as earlier assumed, equation (16.63). With two distinct dimensionless tetrad 4-vectors in hand, $\beta_m$ and the conformal Killing vector $\xi^m$, three gauge-invariant dimensionless scalars can be constructed, $\beta^m \beta_m$, $\xi^m \beta_m$, and $\xi^m \xi_m$,

\begin{align}
1 - \frac{2M}{r} &= \beta^m \beta_m = - \beta_\eta^2 + \beta_x^2 , \tag{16.69} \\
v &\equiv \xi^m \beta_m = \frac{1}{r} \frac{\partial r}{\partial \eta} = \frac{1}{a} \frac{\partial a}{\partial \eta} , \tag{16.70} \\
\Delta &\equiv - \xi^m \xi_m = (\xi^\eta)^2 - (\xi^x)^2 . \tag{16.71}
\end{align}

Equation (16.69) is essentially the same as equation (16.9). The dimensionless quantity $v$, equation (16.70), may be interpreted as a measure of the expansion velocity of the self-similar spacetime. Equation (16.70) shows that $v$ is a function only of $\eta$ (since $a(\eta)$ is a function only of $\eta$), and it therefore follows that $v$ must be constant (since being dimensionless means that $v$ must be a function only of $x$). Equation (16.70) then also implies that the conformal factor $a(\eta)$ must take the form

\[ a(\eta) = e^{v\eta} . \tag{16.72} \]

Because of the freedom of a global rescaling of conformal time, it is possible to set $v = 1$ without loss of generality, but in practice it is convenient to keep $v$, because it is then transparent how to take the static limit $v \to 0$. Equation (16.72) along with (16.63) shows that the circumferential radius $r$ is related to the conformal coordinates $\eta$ and $x$ by

\[ r = e^{v\eta + x} . \tag{16.73} \]

The dimensionless quantity $\Delta$, equation (16.71), is the dimensionless horizon function: horizons occur where the horizon function vanishes

\[ \Delta = 0 \quad \text{at horizons} . \tag{16.74} \]

16.4.4 Self-similar diagonal line-element

The self-similar line-element (16.66) can be brought to diagonal form by a coordinate transformation to diagonal conformal coordinates $\eta_\times$, $x_\times$ (subscripted $\times$ for diagonal)

\[ \eta \to \eta_\times = \eta + f(x) , \qquad x \to x_\times = x - v f(x) , \tag{16.75} \]

which leaves unchanged the conformal factor $r$, equation (16.73). The resulting diagonal metric is

\[ ds^2 = r^2 \left[ - \Delta\, d\eta_\times^2 + \frac{dx_\times^2}{1 - 2M/r + v^2/\Delta} + do^2 \right] . \tag{16.76} \]

The diagonal line-element (16.76) corresponds physically to the case where the tetrad frame is at rest in the similarity frame, $\xi^x = 0$, as can be seen by comparing it to the line-element (16.66). This frame can be called the similarity frame. The form of the metric coefficients follows from the line-element (16.66) and the gauge-invariant scalars (16.69)–(16.71). The conformal Killing vector in the similarity frame is $\xi^m = \{\Delta^{1/2}, 0, 0, 0\}$, and the 4-velocity of the similarity frame in its own frame is $u^m = \{1, 0, 0, 0\}$. Since both are tetrad 4-vectors, it follows that with respect to a general tetrad frame

\[ \xi^m = u^m \Delta^{1/2} \tag{16.77} \]

where $u^m$ is the 4-velocity of the similarity frame with respect to the general frame. This shows that the conformal Killing vector $\xi^m$ in a general tetrad frame is proportional to the 4-velocity of the similarity frame through the tetrad frame. In particular, the proper 3-velocity of the similarity frame through the tetrad frame is

\[ \text{proper 3-velocity of similarity frame through tetrad frame} = \frac{\xi^x}{\xi^\eta} . \tag{16.78} \]

16.4.5 Ray-tracing line-element

It proves useful to introduce a “ray-tracing” conformal radial coordinate $X$ related to the coordinate $x_\times$ of the diagonal line-element (16.76) by

\[ dX \equiv \frac{\Delta\, dx_\times}{\left[ (1 - 2M/r) \Delta + v^2 \right]^{1/2}} . \tag{16.79} \]

In terms of the ray-tracing coordinate $X$, the diagonal metric is

\[ ds^2 = r^2 \left[ - \Delta\, d\eta_\times^2 + \frac{dX^2}{\Delta} + do^2 \right] . \tag{16.80} \]

For the Reissner-Nordström geometry, $\Delta = (1 - 2M/r)/r^2$, $\eta_\times = t$, and $X = -1/r$.
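As a quick check of the Reissner-Nordström identification just quoted, the ray-tracing metric (16.80) can be expanded in the static limit $v = 0$, with $M$ understood as the interior mass $M(r)$. The expansion below is a worked verification under those stated values and introduces nothing beyond them.

```latex
\begin{align*}
X = -\frac{1}{r} \;\; &\Rightarrow \;\; dX = \frac{dr}{r^2} , \\
r^2 \Delta \, d\eta_\times^2 &= \left( 1 - \frac{2M}{r} \right) dt^2 , \\
r^2 \frac{dX^2}{\Delta} &= r^2 \, \frac{dr^2 / r^4}{(1 - 2M/r)/r^2} = \frac{dr^2}{1 - 2M/r} , \\
ds^2 &= - \left( 1 - \frac{2M}{r} \right) dt^2 + \frac{dr^2}{1 - 2M/r} + r^2 do^2 .
\end{align*}
```

The last line is the familiar static line-element, in agreement with the diagonal rest form (16.13) when $t_\times = t$ and $1/\alpha_\times^2 = 1 - 2M/r$.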

16.4.6 Geodesics

Spherical symmetry and conformal time translation symmetry imply that geodesic motion in spherically symmetric self-similar spacetimes is described by a complete set of integrals of motion. The integral of motion associated with conformal time translation symmetry can be obtained from Lagrange’s equations of motion
\[ \frac{d}{d\tau} \frac{\partial L}{\partial u^\eta} = \frac{\partial L}{\partial \eta} , \tag{16.81} \]
with effective Lagrangian $L = g_{\mu\nu} u^\mu u^\nu$ for a particle with 4-velocity $u^\mu$. The self-similar metric depends on the conformal time $\eta$ only through the overall conformal factor $g_{\mu\nu} \propto a(\eta)^2$. The derivative of the conformal factor is given by $\partial \ln a / \partial \eta = v$, equation (16.70), so it follows that $\partial L / \partial \eta = 2 v L$. The left hand side of (16.81) involves $\partial L / \partial u^\eta = 2 u_\eta$. For a massive particle, for which conservation of rest mass implies $g_{\mu\nu} u^\mu u^\nu = -1$, Lagrange’s equations (16.81) thus yield
\[ \frac{du_\eta}{d\tau} = - v . \tag{16.82} \]
In the limit of zero accretion rate, $v \to 0$, equation (16.82) would integrate to give $u_\eta$ as a constant, the energy per unit mass of the geodesic. But here there is conformal time translation symmetry in place of time translation symmetry, and equation (16.82) integrates to
\[ u_\eta = - v \tau , \tag{16.83} \]

in which an arbitrary constant of integration has been absorbed into a shift in the zero point of the proper time τ . Although the above derivation was for a massive particle, it holds also for a massless particle, with the understanding that the proper time τ is constant along a null geodesic. The quantity uη in equation (16.83)


is the covariant time component of the coordinate-frame 4-velocity $u^\mu$ of the particle; it is related to the covariant components $u_m$ of the tetrad-frame 4-velocity of the particle by
\[ u_\eta = e^m{}_\eta \, u_m = r \, \xi^m u_m . \tag{16.84} \]

Without loss of generality, geodesic motion can be taken to lie in the equatorial plane $\theta = \pi/2$ of the spherical spacetime. The integrals of motion associated with conformal time translation symmetry, rotational symmetry about the polar axis, and conservation of rest mass, are, for a massive particle,
\[ u_\eta = - v \tau , \quad u_\phi = L_z , \quad u_\mu u^\mu = -1 , \tag{16.85} \]

where $L_z$ is the orbital angular momentum per unit rest mass of the particle. The coordinate 4-velocity $u^\mu \equiv dx^\mu / d\tau$ that follows from equations (16.85) takes its simplest form in the conformal coordinates $\{\eta_\times, X, \theta, \phi\}$ of the ray-tracing metric (16.80):
\[ u^{\eta_\times} = \frac{v \tau}{r^2 \Delta} , \quad u^X = \pm \frac{1}{r^2} \left[ v^2 \tau^2 - (r^2 + L_z^2) \Delta \right]^{1/2} , \quad u^\phi = \frac{L_z}{r^2} . \tag{16.86} \]
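As a consistency check (a worked step added here, not part of the original derivation), the radial component in (16.86) follows from the normalization $u_\mu u^\mu = -1$ evaluated with the ray-tracing metric (16.80) in the equatorial plane, where $g_{\eta_\times \eta_\times} = -r^2 \Delta$, $g_{XX} = r^2/\Delta$, and $g_{\phi\phi} = r^2$:

```latex
\begin{align*}
-1 &= g_{\eta_\times \eta_\times} (u^{\eta_\times})^2
      + g_{XX} (u^X)^2 + g_{\phi\phi} (u^\phi)^2 \\
   &= - \frac{v^2 \tau^2}{r^2 \Delta}
      + \frac{r^2}{\Delta} (u^X)^2 + \frac{L_z^2}{r^2} , \\
(u^X)^2 &= \frac{\Delta}{r^2}
      \left[ \frac{v^2 \tau^2}{r^2 \Delta} - 1 - \frac{L_z^2}{r^2} \right]
   = \frac{1}{r^4} \left[ v^2 \tau^2 - (r^2 + L_z^2) \Delta \right] .
\end{align*}
```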

16.4.7 Null geodesics

The important case of a massless particle follows from taking the limit of a massive particle with infinite energy and angular momentum, $v \tau \to \infty$ and $L_z \to \infty$. To obtain finite results, define an affine parameter $\lambda$ by $d\lambda \equiv v \tau \, d\tau$, and a 4-velocity in terms of it by $v^\mu \equiv dx^\mu / d\lambda$. The integrals of motion (16.85) then become, for a null geodesic,
\[ v_{\eta_\times} = -1 , \quad v_\phi = J_z , \quad v_\mu v^\mu = 0 , \tag{16.87} \]

where $J_z \equiv L_z / (v \tau)$ is the (dimensionless) conformal angular momentum of the particle. The 4-velocity $v^\mu$ along the null geodesic is then, in terms of the coordinates of the ray-tracing metric (16.80),
\[ v^{\eta_\times} = \frac{1}{r^2 \Delta} , \quad v^X = \pm \frac{1}{r^2} \left( 1 - J_z^2 \Delta \right)^{1/2} , \quad v^\phi = \frac{J_z}{r^2} . \tag{16.88} \]
Equations (16.88) yield the shape of a null geodesic by quadrature
\[ \phi = \int \frac{J_z \, dX}{(1 - J_z^2 \Delta)^{1/2}} . \tag{16.89} \]

Equation (16.89) shows that the shape of null geodesics in spherically symmetric self-similar spacetimes hinges on the behavior of the dimensionless horizon function $\Delta(X)$ as a function of the dimensionless ray-tracing variable $X$. Null geodesics go through periapsis or apoapsis in the self-similar frame where the denominator of the integrand of (16.89) is zero, corresponding to $v^X = 0$. In the Reissner-Nordström geometry there is a radius, the photon sphere, where photons can orbit in circles for ever. In non-stationary self-similar solutions there is no conformal radius where photons can orbit for ever (to remain at fixed conformal radius, the photon angular momentum would have to increase in proportion to the conformal factor $r$). There is however a separatrix between null geodesics that do or do not fall into the black hole, and the conformal radius where this occurs can be called the photon sphere


equivalent. The photon sphere equivalent occurs where the denominator of the integrand of equation (16.89) not only vanishes, $v^X = 0$, but is an extremum, which happens where the horizon function $\Delta$ is an extremum,
\[ \frac{d\Delta}{dX} = 0 \quad \text{at photon sphere equivalent} . \tag{16.90} \]
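The condition (16.90) can be illustrated numerically in the stationary case. The sketch below takes the Reissner-Nordström horizon function quoted in §16.4.5, $\Delta = (1 - 2M/r)/r^2$ with $X = -1/r$, and additionally assumes that the interior mass is $M(r) = M_\infty - Q^2/(2r)$ for mass $M_\infty$ and charge $Q$ at infinity, so that $\Delta(X) = X^2 + 2 M_\infty X^3 + Q^2 X^4$. That form of the interior mass, and all function names, are assumptions made for illustration only.

```python
import math

def horizon_function(X, mass, charge):
    """Delta(X) = X^2 + 2 M X^3 + Q^2 X^4, with X = -1/r (assumed RN form)."""
    return X**2 + 2.0 * mass * X**3 + charge**2 * X**4

def d_horizon_function(X, mass, charge, h=1e-6):
    """Central finite-difference estimate of d(Delta)/dX."""
    upper = horizon_function(X + h, mass, charge)
    lower = horizon_function(X - h, mass, charge)
    return (upper - lower) / (2.0 * h)

def photon_sphere_radius(mass, charge):
    """Outer root of d(Delta)/dX = 0, which reduces to r^2 - 3 M r + 2 Q^2 = 0."""
    disc = 9.0 * mass**2 - 8.0 * charge**2
    if disc < 0.0:
        raise ValueError("d(Delta)/dX has no real root: 9 M^2 < 8 Q^2")
    return 0.5 * (3.0 * mass + math.sqrt(disc))

def critical_impact_parameter(mass, charge):
    """J_z for which 1 - J_z^2 Delta vanishes at the extremum of Delta."""
    X_ph = -1.0 / photon_sphere_radius(mass, charge)
    return 1.0 / math.sqrt(horizon_function(X_ph, mass, charge))

r_ph = photon_sphere_radius(1.0, 0.0)
print(r_ph)                                  # 3.0 for the uncharged case
print(d_horizon_function(-1.0 / r_ph, 1.0, 0.0))
print(critical_impact_parameter(1.0, 0.0))
```

For $Q = 0$ this recovers the familiar Schwarzschild values $r = 3M$ and $J_z = 3\sqrt{3}\,M$. In an accreting self-similar solution $\Delta(X)$ is only known numerically, and the extremum would have to be located along the integration instead.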

16.4.8 Dimensional analysis

Dimensional analysis shows that the conformal coordinates $x^\mu \equiv \{\eta, x, \theta, \phi\}$ and the tetrad metric $\gamma_{mn}$ are dimensionless, while the coordinate metric $g_{\mu\nu}$ scales as $r^2$,
\[ x^\mu \propto r^0 , \quad \gamma_{mn} \propto r^0 , \quad g_{\mu\nu} \propto r^2 . \tag{16.91} \]

The vierbein $e_m{}^\mu$ and inverse vierbein $e^m{}_\mu$, equations (16.67), scale as
\[ e_m{}^\mu \propto r^{-1} , \quad e^m{}_\mu \propto r . \tag{16.92} \]

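Coordinate derivatives $\partial / \partial x^\mu$ are dimensionless, while directed derivatives $\partial_m$ scale as $1/r$,
\[ \frac{\partial}{\partial x^\mu} \propto r^0 , \quad \partial_m \propto r^{-1} . \tag{16.93} \]
The tetrad connections $\Gamma_{kmn}$ and the tetrad-frame Riemann tensor $R_{klmn}$ scale as
\[ \Gamma_{kmn} \propto r^{-1} , \quad R_{klmn} \propto r^{-2} . \tag{16.94} \]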

16.4.9 Variety of self-similar solutions

Self-similar solutions exist provided that the properties of the energy-momentum introduce no additional dimensional parameters. For example, the pressure-to-density ratio $w \equiv p/\rho$ of any species is dimensionless, and since the ratio can depend only on the nature of the species itself, not for example on where it happens to be located in the spacetime, it follows that the ratio $w$ must be a constant. It is legitimate for the pressure-to-density ratio to be different in the radial and transverse directions (as it is for a radial electric field), but otherwise self-similarity requires that
\[ w \equiv p/\rho , \quad w_\perp \equiv p_\perp/\rho , \tag{16.95} \]

be constants for each species. For example, w = 1 for an ultrahard fluid (which can mimic the behaviour of a massless scalar field: E. Babichev, S. Chernov, V. Dokuchaev, Yu. Eroshenko, 2008, “Ultra-hard fluid and scalar field in the Kerr-Newman metric,” Phys. Rev. D 78, 104027, arXiv:0807.0449), w = 1/3 for a relativistic fluid, w = 0 for pressureless cold dark matter, w = −1 for vacuum energy, and w = −1 with w⊥ = 1 for a radial electric field. Self-similarity allows that the energy-momentum may consist of several distinct components, such as a relativistic fluid, plus dark matter, plus an electric field. The components may interact with each other provided that the properties of the interaction introduce no additional dimensional parameters. For example, the relativistic fluid (and the dark matter) may be charged, and if so then the charged fluid will experience


a Lorentz force from the electric field, and will therefore exchange momentum with the electric field. If the fluid is non-conducting, then there is no dissipation, and the interaction between the charged fluid and electric field automatically introduces no additional dimensional parameters. However, if the charged fluid is electrically conducting, then the electrical conductivity could potentially introduce an additional dimensional parameter, and this must not be allowed if self-similarity is to be maintained. In diffusive electrical conduction in a fluid of conductivity $\sigma$, an electric field $E$ gives rise to a current
\[ j = \sigma E , \tag{16.96} \]

which is just Ohm’s law. Dimensional analysis shows that $j \propto r^{-2}$ and $E \propto r^{-1}$, so the conductivity must scale as $\sigma \propto r^{-1}$. The conductivity can depend only on the intrinsic properties of the conducting fluid, and the only intrinsic property available is its density, which scales as $\rho \propto r^{-2}$. It follows that the conductivity must be proportional to the square root of the density $\rho$ of the conducting fluid
\[ \sigma = \kappa \, \rho^{1/2} , \tag{16.97} \]

where κ is a dimensionless conductivity constant. The form (16.97) is required by self-similarity, and is not necessarily realistic (although it is realistic that the conductivity increases with density). However, the conductivity (16.97) is adequate for the purpose of exploring the consequences of dissipation in simple models of black holes.

16.4.10 Tetrad connections

The expressions for the tetrad connections for the self-similar spacetime are the same as those (16.17) for a general spherically symmetric spacetime, with just a relabelling of the time and radial coordinates into conformal coordinates
\[ t \to \eta , \quad r \to x . \tag{16.98} \]

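Specifically, equations (16.17) for the tetrad connections become
\[ \Gamma_{\eta x \eta} = h_\eta , \quad \Gamma_{\eta x x} = h_x , \quad \Gamma_{\eta\theta\theta} = \Gamma_{\eta\phi\phi} = \frac{\beta_\eta}{r} , \quad \Gamma_{x\theta\theta} = \Gamma_{x\phi\phi} = \frac{\beta_x}{r} , \quad \Gamma_{\theta\phi\phi} = \frac{\cot\theta}{r} , \tag{16.99} \]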

in which $h_\eta$ and $h_x$ have the same physical interpretation discussed in §16.1.4 for the general spherically symmetric case: $h_\eta$ is the proper radial acceleration, and $h_x$ is the radial Hubble parameter. Expressions (16.18) and (16.19) for $h_\eta$ and $h_x$ translate in the self-similar spacetime to
\[ h_\eta \equiv \partial_x \ln (r \xi^\eta) , \quad h_x \equiv \partial_\eta \ln (r \xi^x) . \tag{16.100} \]

Comparing equations (16.100) to equations (16.18) and (16.23) shows that the vierbein coefficient $\alpha$ and scale factor $\lambda$ translate in the self-similar spacetime to
\[ \alpha = \frac{1}{r \xi^\eta} , \quad \lambda = \frac{1}{r \xi^x} . \tag{16.101} \]


16.4.11 Spherical equations carry over to the self-similar case

The tetrad-frame Riemann, Weyl, and Einstein tensors in the self-similar spacetime take the same form as in the general spherical case, equations (16.24)–(16.26), with just a relabelling (16.98) into conformal coordinates. Likewise, the equations for the interior mass in §16.1.8, for energy-momentum conservation in §16.1.9, for the first law in §16.1.10, and the various equations for the electromagnetic field in §16.2, all carry through unchanged except for a relabelling (16.98) of coordinates.

16.4.12 From partial to ordinary differential equations

The central simplifying feature of self-similar solutions is that they turn a system of partial differential equations into a system of ordinary differential equations. By definition, a dimensionless quantity $F(x)$ is independent of conformal time $\eta$. It follows that the partial derivative of any dimensionless quantity $F(x)$ with respect to conformal time $\eta$ vanishes
\[ 0 = \frac{\partial F(x)}{\partial \eta} = \xi^m \partial_m F(x) = (\xi^\eta \partial_\eta + \xi^x \partial_x) F(x) . \tag{16.102} \]

Consequently the directed radial derivative $\partial_x F$ of a dimensionless quantity $F(x)$ is related to its directed time derivative $\partial_\eta F$ by
\[ \partial_x F(x) = - \frac{\xi^\eta}{\xi^x} \, \partial_\eta F(x) , \tag{16.103} \]
which is equation (16.102) solved for $\partial_x F$. Equation (16.103) allows radial derivatives to be converted to time derivatives.

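16.4.13 Integrals of motion

As remarked above, equation (16.102), in self-similar solutions $\xi^m \partial_m F(x) = 0$ for any dimensionless function $F(x)$. If both the directed derivatives $\partial_\eta F(x)$ and $\partial_x F(x)$ are known from the Einstein equations or elsewhere, then the result will be an integral of motion. The spherically symmetric, self-similar Einstein equations admit two integrals of motion
\begin{align}
0 = r \, \xi^m \partial_m \beta_\eta &= r \beta_x (\xi^\eta h_\eta + \xi^x h_x) - \xi^\eta \left( \frac{M}{r} + 4\pi r^2 p \right) + \xi^x \, 4\pi r^2 f , \tag{16.104a} \\
0 = r \, \xi^m \partial_m \beta_x &= r \beta_\eta (\xi^\eta h_\eta + \xi^x h_x) + \xi^x \left( \frac{M}{r} - 4\pi r^2 \rho \right) + \xi^\eta \, 4\pi r^2 f . \tag{16.104b}
\end{align}
Taking $\xi^x$ times (16.104a) plus $\xi^\eta$ times (16.104b), and then $\beta_\eta$ times (16.104a) minus $\beta_x$ times (16.104b), gives
\begin{align}
0 &= r v \, (\xi^\eta h_\eta + \xi^x h_x) - 4\pi r^2 \left[ \xi^\eta \xi^x (\rho + p) - \left( (\xi^\eta)^2 + (\xi^x)^2 \right) f \right] , \tag{16.105a} \\
0 = r \, \xi^m \partial_m \frac{M}{r} &= - v \frac{M}{r} + 4\pi r^2 \left[ \beta_x \xi^x \rho - \beta_\eta \xi^\eta p + (\beta_\eta \xi^x - \beta_x \xi^\eta) f \right] . \tag{16.105b}
\end{align}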


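The quantities in square brackets on the right hand sides of equations (16.105) are scalars for each species $a$, so equations (16.105) can also be written
\begin{align}
r v \, (\xi^\eta h_\eta + \xi^x h_x) &= 4\pi r^2 \sum_{\text{species } a} \xi_a^\eta \xi_a^x (\rho_a + p_a) , \tag{16.106a} \\
v \frac{M}{r} &= 4\pi r^2 \sum_{\text{species } a} \left( \beta_{a,x} \xi_a^x \rho_a - \beta_{a,\eta} \xi_a^\eta p_a \right) , \tag{16.106b}
\end{align}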

where the sum is over all species $a$, and $\beta_{a,m}$ and $\xi_a^m$ are the 4-vectors $\beta_m$ and $\xi^m$ expressed in the rest frame of species $a$. For electrically charged solutions, a third integral of motion comes from
\[ 0 = r \, \xi^m \partial_m \frac{Q}{r} = - v \frac{Q}{r} + 4\pi r^2 (\xi^x q - \xi^\eta j) , \tag{16.107} \]
which is valid in any radial tetrad frame, not just the center-of-mass frame. For a fluid with equation of state $p = w\rho$, a fourth integral comes from considering
\[ 0 = r \, \xi^m \partial_m (r^2 p) = r \left[ w \, \xi^\eta \partial_\eta (r^2 \rho) + \xi^x \partial_x (r^2 p) \right] \tag{16.108} \]

and simplifying using the energy conservation equation for ∂η ρ and the momentum conservation equation for ∂x p.

16.4.14 Integration variable

It is desirable to choose an integration variable that varies monotonically. A natural choice is the proper time $\tau$ in the tetrad frame, since this is guaranteed to increase monotonically. Since the 4-velocity at rest in the tetrad frame is by definition $u^m = \{1, 0, 0, 0\}$, the proper time derivative is related to the directed conformal time derivative in the tetrad frame by $d/d\tau = u^m \partial_m = \partial_\eta$. However, there is another choice of integration variable, the ray-tracing variable $X$ defined by equation (16.79), that is not specifically tied to any tetrad frame, and that has a desirable (tetrad and coordinate) gauge-invariant meaning. The proper time derivative of any dimensionless function $F(x)$ in the tetrad frame is related to its derivative $dF/dX$ with respect to the ray-tracing variable $X$ by
\[ \partial_\eta F = u^m \partial_m F = u^X \partial_X F = - \frac{\xi^x}{r} \frac{dF}{dX} . \tag{16.109} \]

In the third expression, $u^X \partial_X F$ is $u^m \partial_m F$ expressed in the similarity frame of §16.4.4, the time contribution $u^{\eta_\times} \partial_{\eta_\times} F$ vanishing in the similarity frame because it is proportional to the conformal time derivative $\partial F / \partial \eta_\times = 0$. In the last expression of (16.109), $u^X$ has been replaced by $-u^x = -\xi^x / \Delta^{1/2}$ in view of equation (16.77), the minus sign coming from the fact that $u^X$ is the radial component of the tetrad 4-velocity of the tetrad frame relative to the similarity frame, while $u^x$ in equation (16.77) is the radial component of the tetrad 4-velocity of the similarity frame relative to the tetrad frame. Also in the last expression of (16.109), the directed derivative $\partial_X$ with respect to the ray-tracing variable $X$ has been translated into its coordinate partial derivative, $\partial_X = (\Delta^{1/2}/r) \, \partial/\partial X$, which follows from the metric (16.80).


In summary, the chosen integration variable is the dimensionless ray-tracing variable $-X$ (with a minus because $-X$ is monotonically increasing), the derivative with respect to which, acting on any dimensionless function, is related to the proper time derivative in any tetrad frame (not just the baryonic frame) by
\[ - \frac{d}{dX} = \frac{r}{\xi^x} \, \partial_\eta . \tag{16.110} \]

Equation (16.110) involves ξ x , which is proportional to the proper velocity of the tetrad frame through the similarity frame, equation (16.78), and which therefore, being initially positive, must always remain positive in any tetrad frame attached to a fluid, as long as the fluid does not turn back on itself, as must be true for the self-similar solution to be consistent.

16.4.15 Summary of equations for a single charged fluid

For reference, it is helpful to collect here the full set of equations governing self-similar spherically symmetric evolution in the case of a single charged fluid with isotropic equation of state
\[ p = p_\perp = w \rho , \tag{16.111} \]
and conductivity
\[ \sigma = \kappa \, \rho^{1/2} . \tag{16.112} \]

In accordance with the arguments in §16.4.9, equations (16.95) and (16.97), self-similarity requires that the pressure-to-density ratio $w$ and the conductivity coefficient $\kappa$ both be (dimensionless) constants. It is natural to work in the center-of-mass frame of the fluid, which also coincides with the center-of-mass frame of the fluid plus electric field (the electric field, being invariant under Lorentz boosts, does not pick out any particular radial frame). The proper time $\tau$ in the fluid frame evolves as
\[ - \frac{d\tau}{dX} = \frac{r}{\xi^x} , \tag{16.113} \]

which follows from equation (16.110) and the fact that $\partial_\eta \tau = 1$. The circumferential radius $r$ evolves along the path of the fluid as
\[ - \frac{d \ln r}{dX} = \frac{\beta_\eta}{\xi^x} . \tag{16.114} \]
Although it is straightforward to write down the equations governing how the tetrad frame moves through the conformal coordinates $\eta$ and $x$, there is not much to be gained from this because the conformal coordinates have no fundamental physical significance. Next, the defining equations (16.100) for the proper acceleration $h_\eta$ and Hubble parameter $h_x$ yield equations for the evolution of the time and radial components of the conformal Killing vector $\xi^m$
\begin{align}
- \frac{d\xi^\eta}{dX} &= \beta_x - r h_\eta , \tag{16.115a} \\
- \frac{d\xi^x}{dX} &= - \beta_\eta + r h_x , \tag{16.115b}
\end{align}
in which, in the formula for $h_\eta$, equation (16.103) has been used to convert the conformal radial derivative $\partial_x$ to the conformal time derivative $\partial_\eta$, and thence to $-d/dX$ by equation (16.110). Next, the Einstein equations (16.26c) and (16.26c) (with coordinates relabeled per (16.98)) in the center-of-mass frame (16.33) yield evolution equations for the time and radial components of the vierbein coefficients $\beta_m$
\begin{align}
- \frac{d\beta_\eta}{dX} &= - \frac{\beta_x}{\xi^\eta} \, r h_x , \tag{16.116a} \\
- \frac{d\beta_x}{dX} &= \frac{\beta_\eta}{\xi^x} \, r h_\eta , \tag{16.116b}
\end{align}

where again, in the formula for $\beta_\eta$, equation (16.103) has been used to convert the conformal radial derivative $\partial_x$ to the conformal time derivative $\partial_\eta$. The 4 evolution equations (16.115) and (16.116) for $\xi^m$ and $\beta_m$ are not independent: they are related by $\xi^m \beta_m = v$, a constant, equation (16.70). To maintain numerical precision, it is important to avoid expressing small quantities as differences of large quantities. In practice, a suitable choice of variables to integrate proves to be $\xi^\eta + \xi^x$, $\beta_\eta - \beta_x$, and $\beta_x$, each of which can be tiny in some circumstances. Starting from these variables, the following equations yield $\xi^\eta - \xi^x$, along with the interior mass $M$ and the horizon function $\Delta$, equations (16.69) and (16.71), in a fashion that ensures numerical stability:
\begin{align}
\xi^\eta - \xi^x &= \frac{2v - (\xi^\eta + \xi^x)(\beta_\eta + \beta_x)}{\beta_\eta - \beta_x} , \tag{16.117a} \\
\frac{2M}{r} &= 1 + (\beta_\eta + \beta_x)(\beta_\eta - \beta_x) , \tag{16.117b} \\
\Delta &= (\xi^\eta + \xi^x)(\xi^\eta - \xi^x) . \tag{16.117c}
\end{align}
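The reconstruction in (16.117) can be sketched in a few lines. The function below is an illustrative helper, not code from the text: it takes the three integration variables $\xi^\eta + \xi^x$, $\beta_\eta - \beta_x$, $\beta_x$ and the constant $v$, and returns $\xi^\eta - \xi^x$, $2M/r$, and $\Delta$. All names are hypothetical.

```python
def derived_quantities(xi_sum, beta_diff, beta_x, v):
    """Apply (16.117a)-(16.117c).

    xi_sum    = xi^eta + xi^x
    beta_diff = beta_eta - beta_x
    Returns (xi_diff, two_m_over_r, delta), where xi_diff = xi^eta - xi^x.
    """
    # beta_eta + beta_x, formed by addition from the stored variables
    beta_sum = beta_diff + 2.0 * beta_x
    xi_diff = (2.0 * v - xi_sum * beta_sum) / beta_diff    # (16.117a)
    two_m_over_r = 1.0 + beta_sum * beta_diff              # (16.117b)
    delta = xi_sum * xi_diff                               # (16.117c)
    return xi_diff, two_m_over_r, delta

# Arbitrary illustrative values: xi^eta = 2, xi^x = 1, beta_eta = -0.5, beta_x = 0.25,
# so that v = xi^eta beta_eta + xi^x beta_x = -0.75.
print(derived_quantities(3.0, -0.75, 0.25, -0.75))
```

Equation (16.117a) is just the identity $(\xi^\eta + \xi^x)(\beta_\eta + \beta_x) + (\xi^\eta - \xi^x)(\beta_\eta - \beta_x) = 2v$ solved for $\xi^\eta - \xi^x$, so the helper never forms $\xi^\eta - \xi^x$ by subtracting two nearly equal stored numbers.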

The evolution equations (16.115) and (16.116) involve $h_\eta$ and $h_x$. The integrals of motion considered in §16.4.13 yield explicit expressions for $h_\eta$ and $h_x$ not involving any derivatives. For the Hubble parameter $h_x$, equation (16.105a) gives
\[ r h_x = - \frac{\xi^\eta}{\xi^x} \, r h_\eta + \frac{\xi^\eta}{v} \, 4\pi\varepsilon , \tag{16.118} \]
where $\varepsilon$ is the dimensionless enthalpy
\[ \varepsilon \equiv r^2 (1 + w) \rho . \tag{16.119} \]

For the proper acceleration $h_\eta$, a somewhat lengthy calculation starting from the integral of motion (16.108), and simplifying using the integral of motion (16.107) for $Q$, the expression (16.118) for $h_x$, Maxwell’s equation (16.54b) [with the relabelling (16.98)], and the conductivity (16.112) in Ohm’s law (16.96), gives
\[ r h_\eta = \frac{\xi^x \left\{ 8\pi w_\perp (\beta_x \xi^x - w \beta_\eta \xi^\eta) \, r^2 \rho + \left[ v + (1 + w) \, 4\pi r \sigma \xi^\eta \right] Q^2/r^2 - w \, (4\pi \xi^\eta \varepsilon)^2 / v \right\}}{4\pi\varepsilon \left[ (\xi^x)^2 - w (\xi^\eta)^2 \right]} . \tag{16.120} \]
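As a sketch of how the algebraic expressions (16.118)–(16.120) might be evaluated in practice, the following function returns $r h_\eta$ and $r h_x$ for the isotropic case $w_\perp = w$ of (16.111), with conductivity (16.112). The function and argument names are hypothetical, and the code assumes $v \neq 0$ and a point away from where $(\xi^x)^2 = w (\xi^\eta)^2$, at which the denominator of (16.120) vanishes.

```python
import math

def accelerations(xi_eta, xi_x, beta_eta, beta_x, v, r, rho, charge, w, kappa):
    """Return (r*h_eta, r*h_x) from (16.120) and (16.118), taking w_perp = w."""
    eps = r**2 * (1.0 + w) * rho          # dimensionless enthalpy (16.119)
    sigma = kappa * math.sqrt(rho)        # conductivity (16.112)
    numerator = xi_x * (
        8.0 * math.pi * w * (beta_x * xi_x - w * beta_eta * xi_eta) * r**2 * rho
        + (v + (1.0 + w) * 4.0 * math.pi * r * sigma * xi_eta) * charge**2 / r**2
        - w * (4.0 * math.pi * xi_eta * eps) ** 2 / v
    )
    denominator = 4.0 * math.pi * eps * (xi_x**2 - w * xi_eta**2)
    rh_eta = numerator / denominator                                       # (16.120)
    rh_x = -(xi_eta / xi_x) * rh_eta + (xi_eta / v) * 4.0 * math.pi * eps  # (16.118)
    return rh_eta, rh_x

# Arbitrary illustrative inputs, chosen only to exercise the formulas.
print(accelerations(2.0, 1.0, -0.5, 0.25, -0.75, 1.0, 0.01, 0.5, 1.0 / 3.0, 1.0))
```

A quick sanity check on any such implementation is that the two outputs satisfy $v \, (\xi^\eta \, r h_\eta + \xi^x \, r h_x) = 4\pi \xi^\eta \xi^x \varepsilon$, which is (16.105a) with $f = 0$ in the center-of-mass frame.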


Two more equations complete the suite. The first, which represents energy conservation for the fluid, can be written as an equation governing the entropy $S$ of the fluid
\[ - \frac{d \ln S}{dX} = \frac{\sigma Q^2}{r^3 (1 + w) \rho \, \xi^x} , \tag{16.121} \]
in which $S$ is (up to an arbitrary constant) the entropy of a comoving volume element $V \propto r^3 \xi^x$ of the fluid
\[ S \equiv r^3 \xi^x \rho^{1/(1+w)} . \tag{16.122} \]
That equation (16.121) really is an entropy equation can be confirmed by rewriting it as
\[ \frac{1}{V} \left( \frac{d \rho V}{d\tau} + p \frac{dV}{d\tau} \right) = jE = \frac{\sigma Q^2}{r^4} , \tag{16.123} \]

in which $jE$ is recognized as the Ohmic dissipation, the rate per unit volume at which the volume element $V$ is being heated. The final equation represents electromagnetic energy conservation, equation (16.57a), which can be written
\[ - \frac{d \ln Q}{dX} = - \frac{4\pi r \sigma}{\xi^x} . \tag{16.124} \]

The (heat) energy going into the fluid is balanced by the (free) energy coming out of the electromagnetic field.
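The evolution equations of this section can be collected into a single right-hand-side function suitable for handing to an ODE integrator. The sketch below is one possible arrangement, not the author's code: the names are hypothetical, every entry is a derivative with respect to $-X$, and $r h_\eta$ and $r h_x$ are taken as inputs already computed from (16.120) and (16.118). The entropy source is written in the dimensionless form $\sigma Q^2 / [r^3 (1+w) \rho \, \xi^x]$ obtained by combining (16.123) with $-d/dX = (r/\xi^x) \, d/d\tau$.

```python
import math

def derivatives(r, xi_eta, xi_x, beta_eta, beta_x, rh_eta, rh_x,
                rho, charge, w, kappa):
    """Derivatives with respect to -X of the evolved quantities.

    rh_eta and rh_x stand for r*h_eta and r*h_x, supplied by the caller.
    """
    sigma = kappa * math.sqrt(rho)                      # (16.112)
    return {
        "tau": r / xi_x,                                # (16.113)
        "ln_r": beta_eta / xi_x,                        # (16.114)
        "xi_eta": beta_x - rh_eta,                      # (16.115a)
        "xi_x": -beta_eta + rh_x,                       # (16.115b)
        "beta_eta": -(beta_x / xi_eta) * rh_x,          # (16.116a)
        "beta_x": (beta_eta / xi_x) * rh_eta,           # (16.116b)
        "ln_S": sigma * charge**2 / (r**3 * (1.0 + w) * rho * xi_x),  # (16.121)
        "ln_Q": -4.0 * math.pi * r * sigma / xi_x,      # (16.124)
    }

# Arbitrary illustrative state; rh_eta and rh_x are placeholders here.
d = derivatives(1.0, 2.0, 1.0, -0.5, 0.25, 0.3, -0.7, 0.01, 0.5, 1.0 / 3.0, 1.0)
drift = (d["xi_eta"] * -0.5 + 2.0 * d["beta_eta"]
         + d["xi_x"] * 0.25 + 1.0 * d["beta_x"])
print(drift)  # rate of change of xi^m beta_m, which should vanish up to rounding
```

The last lines check that $\xi^m \beta_m = v$ is conserved by (16.115) and (16.116) for any values of $h_\eta$ and $h_x$, which is a useful regression test. A production integrator would evolve the combinations $\xi^\eta + \xi^x$, $\beta_\eta - \beta_x$, and $\beta_x$ recommended above rather than the raw components.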

17 The interiors of spherical black holes

As discussed in Chapter 8, the Reissner-Nordström geometry for an ideal charged spherical black hole contains mathematical wormhole and white hole extensions to other universes. In reality, these extensions are not expected to occur, thanks to the mass inflation instability discovered by Poisson & Israel (1990).

17.1 The mechanism of mass inflation

17.1.1 Reissner-Nordström phase

Figure 17.1 illustrates how the two Einstein equations (16.52) produce the three phases of mass inflation inside a charged spherical black hole. During the initial phase, illustrated in the top panel of Figure 17.1, the spacetime geometry is well approximated by the vacuum Reissner-Nordström geometry. During this phase the radial energy flux $f$ is effectively zero, so $\beta_r$ remains constant, according to equation (16.52b). The change in the radial velocity $\beta_t$, equation (16.52a), depends on the competition between the Newtonian gravitational force $-M/r^2$, which is always attractive (tending to make the radial velocity $\beta_t$ more negative), and the gravitational force $-4\pi r p$ sourced by the radial pressure $p$. In the Reissner-Nordström geometry, the static electric field produces a negative radial pressure, or tension, $p = -Q^2/(8\pi r^4)$, which produces a gravitational repulsion $-4\pi r p = Q^2/(2 r^3)$. At some point (depending on the charge-to-mass ratio) inside the outer horizon, the gravitational repulsion produced by the tension of the electric field exceeds the attraction produced by the interior mass $M$, so that the radial velocity $\beta_t$ slows down. This regime, where the (negative) radial velocity $\beta_t$ is slowing down (becoming less negative), while $\beta_r$ remains constant, is illustrated in the top panel of Figure 17.1.

If the initial Reissner-Nordström phase were to continue, then the radial 4-gradient $\beta_m$ would become lightlike. In the Reissner-Nordström geometry this does in fact happen, and where it happens defines the inner horizon. The problem with this is that the lightlike 4-vector $\beta_m$ points in one direction for ingoing frames, and in the opposite direction for outgoing frames. If $\beta_m$ becomes lightlike, then ingoing and outgoing frames are streaming through each other at the speed of light. This is the infinite blueshift at the inner horizon first pointed out by Penrose (1968).

[Figure 17.1: three panels labeled 1. RN, 2. Inflation, and 3. Collapse, with axes $\beta_t$ and $\beta_r$ and arrowed lines labeled ingoing and outgoing.]

Figure 17.1 Spacetime diagrams of the tetrad-frame 4-vector $\beta_m$, equation (16.8), illustrating qualitatively the three successive phases of mass inflation: 1. (top) the Reissner-Nordström phase, where inflation ignites; 2. (middle) the inflationary phase itself; and 3. (bottom) the collapse phase, where inflation comes to an end. In each diagram, the arrowed lines labeled ingoing and outgoing illustrate two representative examples of the 4-vector $\{\beta_t, \beta_r\}$, while the double-arrowed lines illustrate the rate of change of these 4-vectors implied by Einstein’s equations (16.52). Inside the horizon of a black hole, all locally inertial frames necessarily fall inward, so the radial velocity $\beta_t \equiv \partial_t r$ is always negative. A locally inertial frame is ingoing or outgoing depending on whether the proper radial gradient $\beta_r \equiv \partial_r r$ measured in that frame is positive or negative.


If there were no matter present, or if there were only one stream of matter, either ingoing or outgoing but not both, then βm could indeed become lightlike. But if both ingoing and outgoing matter are present, even in the tiniest amount, then it is physically impossible for the ingoing and outgoing frames to stream through each other at the speed of light. If both ingoing and outgoing streams are present, then as they race through each other ever faster, they generate a radial pressure p, and an energy flux f , which begin to take over as the main source on the right hand side of the Einstein equations (16.52). This is how mass inflation is ignited.

17.1.2 Inflationary phase

The infalling matter now enters the second, mass inflationary phase, illustrated in the middle panel of Figure 17.1. During this phase, the gravitational force on the right hand side of the Einstein equation (16.52a) is dominated by the pressure $p$ produced by the counter-streaming ingoing and outgoing matter. The mass $M$ is completely sub-dominant during this phase (in this respect, the designation “mass inflation” is misleading, since although the mass inflates, it does not drive inflation). The counter-streaming pressure $p$ is positive, and so accelerates the radial velocity $\beta_t$ (makes it more negative). At the same time, the radial gradient $\beta_r$ is being driven by the energy flux $f$, equation (16.52b). For typically low accretion rates, the streams are cold, in the sense that the streaming energy density greatly exceeds the thermal energy density, even if the accreted material is at relativistic temperatures. This follows from the fact that for mass inflation to begin, the gravitational force produced by the counter-streaming pressure $p$ must become comparable to that produced by the mass $M$, which for streams of low proper density requires a hyper-relativistic streaming velocity. For a cold stream of proper density $\rho$ moving at 4-velocity $u^m \equiv \{u^t, u^r, 0, 0\}$, the streaming energy flux would be $f \sim \rho u^t u^r$, while the streaming pressure would be $p \sim \rho (u^r)^2$. Thus their ratio $f/p \sim u^t/u^r$ is slightly greater than one. It follows that, as illustrated in the middle panel of Figure 17.1, the change in $\beta_r$ slightly exceeds the change in $\beta_t$, which drives the 4-vector $\beta_m$, already nearly lightlike, to be even more nearly lightlike. This is mass inflation. Inflation feeds on itself. The radial pressure $p$ and energy flux $f$ generated by the counter-streaming ingoing and outgoing streams increase the gravitational force.
But, as illustrated in the middle panel of Figure 17.1, the gravitational force acts in opposite directions for ingoing and outgoing streams, tending to accelerate the streams faster through each other. An intuitive way to understand this is that the gravitational force is always inwards, meaning in the direction of smaller radius, but the inward direction is towards the black hole for ingoing streams, and away from the black hole for outgoing streams. The feedback loop in which the streaming pressure and flux increase the gravitational force, which accelerates the streams faster through each other, which increases the streaming pressure and flux, is what drives mass inflation. Inflation produces an exponential growth in the streaming energy, and along with it the interior mass, and the Weyl curvature.
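The statement that the ratio $f/p$ is only slightly greater than one can be made explicit. Assuming the usual normalization $(u^t)^2 - (u^r)^2 = 1$ of the tetrad-frame 4-velocity of a radially moving stream (a standard relation, not written out in the passage above), the ratio for a hyper-relativistic stream, $|u^r| \gg 1$, is in magnitude

```latex
\[
\left| \frac{f}{p} \right| \sim \left| \frac{u^t}{u^r} \right|
  = \frac{\sqrt{1 + (u^r)^2}}{|u^r|}
  \approx 1 + \frac{1}{2 (u^r)^2} .
\]
```

The excess over unity therefore shrinks as the streaming velocity grows, which is why $\beta_m$ is driven ever closer to lightlike without quite reaching it.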


17.1.3 Collapse phase

It might seem that inflation is locked into an exponential growth from which there is no exit. But the Einstein equations (16.52) have one more trick up their sleeve. For the counter-streaming velocity to continue to increase requires that the change in $\beta_r$ from equation (16.52b) continues to exceed the change in $\beta_t$ from equation (16.52a). This remains true as long as the counter-streaming pressure $p$ and energy flux $f$ continue to dominate the source on the right hand side of the equations. But the mass term $-M/r^2$ also makes a contribution to the change in $\beta_t$, equation (16.52a). As will be seen in the examples of the next two sections, §§?? and ??, at least in the case of pressureless streams the mass term exponentiates slightly faster than the pressure term. At a certain point, the additional acceleration produced by the mass means that the combined gravitational force $M/r^2 + 4\pi r p$ exceeds $4\pi r f$. Once this happens, the 4-vector $\beta_m$, instead of being driven to become more lightlike, starts to become less lightlike. That is, the counter-streaming velocity starts to slow. At that point inflation ceases, and the streams quickly collapse to zero radius. It is ironic that it is the increase of mass that brings mass inflation to an end. Not only does mass not drive mass inflation, but as soon as mass begins to contribute significantly to the gravitational force, it brings mass inflation to an end.

17.2 The far future?

The Penrose diagram of a Reissner-Nordström or Kerr-Newman black hole indicates that an observer who passes through the outgoing inner horizon sees the entire future of the outside universe go by. In a sense, this is “why” the outside universe appears infinitely blueshifted. This raises the question of whether what happens at the outgoing inner horizon of a real black hole indeed depends on what happens in the far future. If it did, then the conclusions of previous sections, which are based in part on the proposition that the accretion rate is approximately constant, would be suspect. A lot can happen in the far future, such as black hole mergers, the Universe ending in a big crunch, Hawking evaporation, or something else beyond our current ken. Ingoing and outgoing observers both see each other highly blueshifted near the inner horizon. An outgoing observer sees ingoing observers from the future, while an ingoing observer sees outgoing observers from the past. If the streaming 4-velocity between ingoing and outgoing streams is $u^m_{ba}$, equation (??), then the proper time $d\tau_b$ that elapses on stream $b$ observed by stream $a$ equals the blueshift factor $u^t_{ba}$ times the proper time $d\tau_a$ experienced by stream $a$,
\[ d\tau_b = u^t_{ba} \, d\tau_a . \tag{17.1} \]

17.2.1 Inflationary phase

A physically relevant timescale for the observing stream $a$ is how long it takes for the blueshift to increase by one e-fold, which is $d\tau_a / d \ln u^t_{ba}$. During the inflationary phase, stream $a$ sees the amount of time elapsed on stream $b$ through one e-fold of blueshift to be
\[ u^t_{ba} \frac{d\tau_a}{d \ln u^t_{ba}} = \frac{2 C_b r_-}{\lambda} . \tag{17.2} \]

The right hand side of equation (17.2) is derived from $u^t_{ba} \approx 2 |u_a u_b|$, $|dr_a/d\tau_a| = |\beta_{a,t}| \approx \beta u_a$, $r_a \approx r_-$, and the approximations (??) and (??) valid during the inflationary phase. The constants $C_b$ and $\lambda$ on the right hand side of equation (17.2) are typically of order unity, while $r_-$ is the radius of the inner horizon where mass inflation takes place. Thus the right hand side of equation (17.2) is of the order of one black hole crossing time. In other words, stream $a$ sees approximately one black hole crossing time elapse on stream $b$ for each e-fold of blueshift. For astronomically realistic black holes, exponentiating the Weyl curvature up to the Planck scale will take typically a few hundred e-folds of blueshift, as illustrated for example in Figure 17.10. Thus what happens at the inner horizon of a realistic black hole before quantum gravity intervenes depends only on the immediate past and future of the black hole (a few hundred black hole crossing times), not on the distant future or past. This conclusion holds even if the accretion rate of one of the ingoing or outgoing streams is tiny compared to the other, as considered in §??. From a stream’s own point of view, the entire inflationary episode goes by in a flash. At the onset of inflation, where $\beta$ goes through its minimum, at $\mu_a u_a^2 = \mu_b u_b^2 = \lambda$ according to equation (??), the blueshift $u^t_{ba}$ is already large
\[ u^t_{ba} = \frac{2\lambda}{\sqrt{\mu_a \mu_b}} , \tag{17.3} \]
thanks to the small accretion rates $\mu_a$ and $\mu_b$. During the first e-fold of blueshift, each stream experiences a proper time of order $\sqrt{\mu_a \mu_b}$ times the black hole crossing time, which is tiny. Subsequent e-folds of blueshift race by in proportionately shorter proper times.
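To get a feel for the numbers, the estimates (17.2) and (17.3) can be combined in a short calculation. The sketch below is illustrative only: the function names are hypothetical, the accretion rates in the example are arbitrary small values rather than figures from the text, and $C_b$, $\lambda$, and $r_-$ are set to unity on the strength of the statement that they are of order unity in units of the crossing time.

```python
import math

def onset_blueshift(mu_a, mu_b, lam=1.0):
    """Blueshift u^t_ba at the onset of inflation, equation (17.3)."""
    return 2.0 * lam / math.sqrt(mu_a * mu_b)

def first_efold_proper_time(mu_a, mu_b, lam=1.0, c_b=1.0, r_minus=1.0):
    """Proper time on stream a for the first e-fold of blueshift.

    Equation (17.2) gives the time seen to elapse on stream b per e-fold,
    2 C_b r_- / lambda; dividing by the onset blueshift (17.3) converts it
    to the proper time experienced by stream a.
    """
    time_on_b = 2.0 * c_b * r_minus / lam
    return time_on_b / onset_blueshift(mu_a, mu_b, lam)

# Hypothetical accretion rates, chosen only to illustrate the scaling.
mu_a = mu_b = 1e-16
print(onset_blueshift(mu_a, mu_b))
print(first_efold_proper_time(mu_a, mu_b))
```

With all order-unity constants set to one, the first e-fold takes a proper time of $\sqrt{\mu_a \mu_b}$ crossing times, in line with the estimate quoted above.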

17.2.2 Collapse phase

The time to reach the collapse phase is another matter. According to the estimates in §§?? and ??, reaching the collapse phase takes of order $\sim 1/\mu_a$ e-folds of blueshift, where $\mu_a$ is the larger of the accretion rates of the ingoing and outgoing streams, equation (??). Thus in reaching the collapse phase, each stream has seen approximately $1/\mu_a$ black hole crossing times elapse on the other stream. But $1/\mu_a$ black hole crossing times is just the accretion time, essentially how long the black hole has existed. This timescale, the age of the black hole, is not infinite, but it can hardly be expected that the accretion rate of a black hole would be constant over its lifetime. If the accretion rate were in fact constant, and if quantum gravity did not intervene and the streams remained non-interacting, then indeed streams inside the black hole would reach the collapse phase, whereupon they would plunge to a spacelike singularity at zero radius. For example, in the self-similar models illustrated in Figures ?? and 17.10, outgoing baryons hitting the central singularity see ingoing dark matter accreted a factor of two into the future (specifically, for ingoing baryons and outgoing dark matter hitting the singularity, the radius of the outer horizon when the dark matter is accreted is twice that when the baryons


were accreted; the numbers are 2.11 for $\dot{M}_\bullet = 0.03$, 1.97 for $\dot{M}_\bullet = 0.01$, and unknown for $\dot{M}_\bullet = 0.003$ because in that case the numerics overflow before the central singularity is reached). The same conclusion applies to the ultra-hard fluid models illustrated in Figure 17.8: if the accretion rate is constant, then outgoing streams see only a factor of order unity or a few into the future before hitting the central singularity. If on the other hand the accretion rate decreases sufficiently rapidly with time, then it is possible that an outgoing stream never reaches the collapse phase, because the number of e-folds to reach the collapse phase just keeps increasing as the accretion rate decreases. By contrast, an ingoing stream is always liable to reach the collapse phase, if quantum gravity does not intervene, because an ingoing stream sees the outgoing stream from the past, when the accretion rate was liable to have been larger. However, as already remarked in §??, in astronomically realistic black holes, it is only for large accretion rates, such as may occur when the black hole first collapses ($\dot{M}_\bullet \gtrsim 0.01$ for the models illustrated in Fig. 17.10), that the collapse phase has a chance of being reached before the Weyl curvature exceeds the Planck scale. To summarize, it is only streams accreted during the first few hundred or so black hole crossing times since a black hole’s formation that have a chance of hitting a central spacelike singularity. Streams accreted at later times, whether ingoing or outgoing, are likely to meet their fate in the inflationary zone at the inner horizon, where the Weyl curvature exponentiates to the Planck scale and beyond.

17.3 Self-similar models of the interior structure of black holes

The apparatus is now in hand actually to do some real calculations of the interior structure of black holes. All the models presented in this section are spherical and self-similar. See Hamilton & Pollack (2005, PRD 71, 084031 & 084032) and Wallace, Hamilton & Polhemus (2008, arXiv:0801.4415) for more.

17.3.1 Boundary conditions at an outer sonic point

Because information can propagate only inward inside the horizon of a black hole, it is natural to set the boundary conditions outside the horizon. The policy adopted here is to set them at a sonic point, where the infalling fluid accelerates from subsonic to supersonic. The proper 3-velocity of the fluid through the self-similar frame is ξ^x/ξ^η, equation (16.78) (the velocity ξ^x/ξ^η is positive falling inward), and the sound speed is

\[ \text{sound speed} = \sqrt{\frac{p_b}{\rho_b}} = \sqrt{w_b} , \qquad (17.4) \]

and sonic points occur where the velocity equals the sound speed

\[ \frac{\xi^x}{\xi^\eta} = \pm \sqrt{w_b} \quad \text{at sonic points} . \qquad (17.5) \]

The denominator of the expression (16.120) for the proper acceleration g is zero at sonic points, indicating that the acceleration will diverge unless the numerator is also zero. What happens at a sonic point depends on whether the fluid transitions from subsonic upstream to supersonic downstream (as here) or vice versa. If (as here) the fluid transitions from subsonic to supersonic, then sound waves generated by discontinuities near the sonic point can propagate upstream, plausibly modifying the flow so as to ensure a smooth transition through the sonic point, effectively forcing the numerator, like the denominator, of the expression (16.120) to pass through zero at the sonic point. Conversely, if the fluid transitions from supersonic to subsonic, then sound waves cannot propagate upstream to warn the incoming fluid that a divergent acceleration is coming, and the result is a shock wave, where the fluid accelerates discontinuously, is heated, and thereby passes from supersonic to subsonic.

The solutions considered here assume that the acceleration g at the sonic point is not only continuous [so the numerator of (16.120) is zero] but also differentiable. Such a sonic point is said to be regular, and the assumption imposes two boundary conditions at the sonic point. The accretion in real black holes is likely to be much more complicated, but the assumption of a regular sonic point is the simplest physically reasonable one.

17.3.2 Mass and charge of the black hole

The mass M• and charge Q• of the black hole at any instant can be defined to be those that would be measured by a distant observer if there were no mass or charge outside the sonic point

\[ M_\bullet = M + \frac{Q^2}{2r} , \quad Q_\bullet = Q \quad \text{at the sonic point} . \qquad (17.6) \]

The mass M• in equation (17.6) includes the mass-energy Q²/2r that would be in the electric field outside the sonic point if there were no charge outside the sonic point, but it does not include mass-energy from any additional mass or charge that might be outside the sonic point. In self-similar evolution, the black hole mass increases linearly with time, M• ∝ t, where t is the proper time at rest far from the black hole. As discussed in §??, this time t equals the proper time τd = r ξd^η/v recorded by dark matter clocks that free-fall radially from zero velocity at infinity. Thus the mass accretion rate Ṁ• is

\[ \dot{M}_\bullet \equiv \frac{dM_\bullet}{dt} = \frac{M_\bullet}{\tau_d} = \frac{v M_\bullet}{r \xi_d^\eta} \quad \text{at the sonic point} . \qquad (17.7) \]

If there is no mass outside the sonic point (apart from the mass-energy in the electric field), then a freely-falling dark matter particle will have

\[ \beta_{d,x} = 1 \quad \text{at the sonic point} , \qquad (17.8) \]

which can be taken as the boundary condition on the dark matter at the sonic point, for either massive or massless dark matter. Equation (17.8) follows from the facts that the geodesic equations in empty space around a charged black hole (Reissner-Nordström metric) imply that βd,x = constant for a radially free-falling particle (the same conclusion can be drawn from the Einstein equation (16.26c)), and that a particle at rest at infinity satisfies βd,η = ∂d,η r = 0, and consequently βd,x = 1 from equation (16.69) with r → ∞.

As remarked following equation (16.72), the residual gauge freedom in the global rescaling of conformal time η allows the expansion velocity v to be adjusted at will. One choice suggested by equation (17.7) is to set (but one could equally well set v to the expansion velocity of the horizon, v = ṙ₊, for example)

\[ v = \dot{M}_\bullet , \qquad (17.9) \]

which is equivalent to setting

\[ \xi_d^\eta = \frac{M_\bullet}{r} \quad \text{at the sonic point} . \qquad (17.10) \]

Equation (17.10) is not a boundary condition: it is just a choice of units of conformal time η. Equation (17.10) and the boundary condition (17.8) coupled with the scalar relations (16.69) and (16.70) fully determine the dark matter 4-vectors βd,m and ξd^m at the sonic point.

17.3.3 Equation of state

The density ρb and temperature Tb of an ideal relativistic baryonic fluid in thermodynamic equilibrium are related by

\[ \rho_b = \frac{\pi^2 g}{30} T_b^4 , \qquad (17.11) \]

where

\[ g = g_B + \frac{7}{8} g_F \qquad (17.12) \]

is the effective number of relativistic particle species, with gB and gF being the number of bosonic and fermionic species. If the expected increase in g with temperature T is modeled (so as not to spoil self-similarity) as a weak power law g/gp = T^ε, with gp the effective number of relativistic species at the Planck temperature, then the relation between density ρb and temperature Tb is

\[ \rho_b = \frac{\pi^2 g_p}{30} T_b^{(1+w_b)/w_b} , \qquad (17.13) \]

with equation of state parameter wb = 1/(3 + ε) slightly less than the standard relativistic value w = 1/3. In the models considered here, the baryonic equation of state is taken to be

\[ w_b = 0.32 . \qquad (17.14) \]

The effective number gp is fixed by setting the number of relativistic particle species to g = 5.5 at T = 10 MeV, corresponding to a plasma of relativistic photons, electrons, and positrons. This corresponds to choosing the effective number of relativistic species at the Planck temperature to be gp ≈ 2,400, which is not unreasonable. The chemical potential of the relativistic baryonic fluid is likely to be close to zero, corresponding to equal numbers of particles and anti-particles. The entropy Sb of a proper Lagrangian volume element V of the fluid is then

\[ S_b = \frac{(\rho_b + p_b) V}{T_b} , \qquad (17.15) \]

which agrees with the earlier expression (16.122), but now has the correct normalization.
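The normalization of the equation of state can be checked numerically. The following is a minimal sketch, not the author's code: it inverts wb = 1/(3 + ε) for ε, then solves g/gp = T^ε for gp using g = 5.5 at T = 10 MeV. The Planck energy of about 1.2209 × 10¹⁹ GeV and all function names are assumptions introduced here for illustration.

```python
import math

# Assumed constant (not given in the text): Planck energy in MeV.
PLANCK_ENERGY_MEV = 1.2209e22


def epsilon_from_w(w_b):
    """Invert w_b = 1/(3 + eps) for the power-law index eps."""
    return 1.0 / w_b - 3.0


def g_planck(g_ref, T_ref_mev, w_b):
    """Solve g/g_p = T^eps for g_p, with T expressed in Planck units."""
    eps = epsilon_from_w(w_b)
    T_ref = T_ref_mev / PLANCK_ENERGY_MEV
    return g_ref / T_ref ** eps


def density(T, g_p, w_b):
    """Density from the power-law relation (17.13), in Planck units."""
    return math.pi ** 2 * g_p / 30.0 * T ** ((1.0 + w_b) / w_b)


g_p = g_planck(g_ref=5.5, T_ref_mev=10.0, w_b=0.32)
print(f"eps = {epsilon_from_w(0.32):.3f}, g_p = {g_p:.0f}")
```

With these assumed inputs the result should land close to the value gp ≈ 2,400 quoted above. The small remaining difference depends on the precise Planck energy adopted.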

17.3.4 Entropy creation

One fundamentally interesting question about black hole interiors is how much entropy might be created inside the horizon. Bekenstein first argued that a black hole should have a quantum entropy proportional to its horizon area A, and Hawking (1974) supplied the constant of proportionality 1/4 in Planck units. The Bekenstein-Hawking entropy SBH is, in Planck units c = G = ħ = 1,

\[ S_{\rm BH} = \frac{A}{4} . \qquad (17.16) \]

For a spherical black hole of horizon radius r₊, the area is A = 4πr₊². Hawking showed that a black hole has a temperature TH equal to 1/(2π) times the surface gravity g₊ at its horizon, again in Planck units,

\[ T_H = \frac{g_+}{2\pi} . \qquad (17.17) \]

The surface gravity is defined to be the proper radial acceleration measured by a person in free-fall at the horizon. For a spherical black hole, the surface gravity is g₊ = −Dt βt = M/r² + 4πrp evaluated at the horizon, equation (16.52a).

The proper velocity of the baryonic fluid through the sonic point equals ξ^x/ξ^η, equation (16.78). Thus the entropy Sb accreted through the sonic point per unit proper time of the fluid is

\[ \frac{dS_b}{d\tau} = \frac{(1 + w_b) \rho_b \, 4\pi r^2 \xi^x}{T_b \, \xi^\eta} . \qquad (17.18) \]

The horizon radius r₊, which is at fixed conformal radius x, expands in proportion to the conformal factor, r₊ ∝ a, and the conformal factor a increases as d ln a/dτ = ∂η ln a = v/(rξ^η), so the Bekenstein-Hawking entropy SBH = πr₊² increases as

\[ \frac{dS_{\rm BH}}{d\tau} = \frac{2\pi r_+^2 v}{r \xi^\eta} . \qquad (17.19) \]

Thus the entropy Sb accreted through the sonic point per unit increase of the Bekenstein-Hawking entropy SBH is

\[ \frac{dS_b}{dS_{\rm BH}} = \left. \frac{(1 + w_b) \rho_b \, 4\pi r^3 \xi^x}{2\pi r_+^2 v T_b} \right|_{r = r_s} . \qquad (17.20) \]
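As a sanity check on the algebra, the ratio (17.20) should equal the quotient of the two rates (17.18) and (17.19), with the factor ξ^η cancelling. The sketch below verifies this numerically. The function names and the sample values are hypothetical, chosen only for illustration, and are not taken from any model in the text.

```python
import math


def accreted_entropy_rate(w_b, rho_b, T_b, r, xi_x, xi_eta):
    """dS_b/dtau through the sonic point, equation (17.18)."""
    return (1.0 + w_b) * rho_b * 4.0 * math.pi * r ** 2 * xi_x / (T_b * xi_eta)


def bh_entropy_rate(r_plus, v, r, xi_eta):
    """dS_BH/dtau for S_BH = pi r_+^2, equation (17.19)."""
    return 2.0 * math.pi * r_plus ** 2 * v / (r * xi_eta)


def entropy_ratio(w_b, rho_b, T_b, r, r_plus, xi_x, v):
    """dS_b/dS_BH, equation (17.20); xi_eta has cancelled."""
    return ((1.0 + w_b) * rho_b * 4.0 * math.pi * r ** 3 * xi_x
            / (2.0 * math.pi * r_plus ** 2 * v * T_b))


# Arbitrary illustrative values (assumptions, not model output).
sample = dict(w_b=0.32, rho_b=1e-3, T_b=0.1, r=3.0, xi_x=0.5)
r_plus, v, xi_eta = 2.0, 0.01, 0.8

quotient = (accreted_entropy_rate(xi_eta=xi_eta, **sample)
            / bh_entropy_rate(r_plus, v, sample["r"], xi_eta))
direct = entropy_ratio(r_plus=r_plus, v=v, **sample)
print(quotient, direct)
```

The two printed numbers should agree to rounding error for any positive inputs, since the check is purely algebraic.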

Inside the sonic point, dissipation increases the entropy according to equation (16.121). Since the entropy can diverge at a central singularity where the density diverges, and quantum gravity presumably intervenes at some point, it makes sense to truncate the production of entropy at a “splat” point where the density ρb hits a prescribed splat density ρ#

\[ \rho_b = \rho_\# . \qquad (17.21) \]

Integrating equation (16.121) from the sonic point to the splat point yields the ratio of the entropies at the sonic and splat points. Multiplying the accreted entropy, equation (17.20), by this ratio yields the rate of increase of the entropy of the black hole, truncated at the splat point, per unit increase of its Bekenstein-Hawking entropy

\[ \frac{dS_b}{dS_{\rm BH}} = \left. \frac{(1 + w_b) \rho_b \, 4\pi r^3 \xi^x}{2\pi r_+^2 v T_b} \right|_{\rho_b = \rho_\#} . \qquad (17.22) \]

17.3.5 Holography

The idea that the entropy of a black hole cannot exceed its Bekenstein-Hawking entropy has motivated holographic conjectures that the degrees of freedom of a volume are somehow encoded on its boundary, and consequently that the entropy of a volume is bounded by those degrees of freedom. Various counterexamples dispose of most simple-minded versions of holographic entropy bounds. The most successful entropy bound, with no known counterexamples, is Bousso’s covariant entropy bound (Bousso 2002, Rev. Mod. Phys. 74, 825). The covariant entropy bound concerns not just any old 3-dimensional volume, but rather the 3-dimensional volume formed by a null hypersurface, a lightsheet. For example, the horizon of a black hole is a null hypersurface, a lightsheet. The covariant entropy bound asserts that the entropy that passes (inward or outward) through a lightsheet that is everywhere converging cannot exceed 1/4 of the 2-dimensional area of the boundary of the lightsheet.

In the self-similar black holes under consideration, the horizon is expanding, and outgoing lightrays that sit on the horizon do not constitute a converging lightsheet. However, a spherical shell of ingoing lightrays that starts on the horizon falls inwards and therefore does form a converging lightsheet, and a spherical shell of outgoing lightrays that starts just slightly inside the horizon also falls inward and forms a converging lightsheet. The rate at which entropy Sb passes through such ingoing or outgoing spherical lightsheets per unit decrease in the area Scov ≡ πr² of the lightsheet is

\[ \frac{dS_b}{dS_{\rm cov}} = \frac{dS_b}{dS_{\rm BH}} \, \frac{r_+^2 v}{r^2 \xi^x |\beta_\eta \mp \beta_x|} , \qquad (17.23) \]

in which the ∓ sign is − for ingoing, + for outgoing lightsheets. A sufficient condition for Bousso’s covariant entropy bound to be satisfied is

\[ |dS_b/dS_{\rm cov}| \le 1 . \qquad (17.24) \]

17.3.6 Black hole accreting a neutral relativistic plasma

The simplest case to consider is that of a black hole accreting a neutral relativistic plasma. In the self-similar solutions, the charge of the black hole is produced self-consistently by the accreted charge of the baryonic fluid, so a neutral fluid produces an uncharged black hole. Figure 17.2 shows the baryonic density ρb and Weyl curvature C inside the uncharged black hole. The mass and accretion rate have been taken to be

\[ M_\bullet = 4 \times 10^6 \, M_\odot , \quad \dot{M}_\bullet = 10^{-16} , \qquad (17.25) \]

[Plot: baryonic density ρb, Weyl curvature −C, and entropy rate dSb/dSBH, in Planck units, versus radius r in Planck units; the horizon and the Planck scale are marked.]

Figure 17.2 An uncharged baryonic plasma falls into an uncharged spherical black hole. The plot shows in Planck units, as a function of radius, the plasma density ρb, the Weyl curvature scalar C (which is negative), and the rate dSb/dSBH of increase of the plasma entropy per unit increase in the Bekenstein-Hawking entropy of the black hole. The mass is M• = 4 × 10⁶ M⊙, the accretion rate is Ṁ• = 10⁻¹⁶, and the equation of state is wb = 0.32.

which are motivated by the fact that the mass of the supermassive black hole at the center of the Milky Way is 4 × 10⁶ M⊙, and its accretion rate is

\[ \frac{\text{mass of MW black hole}}{\text{age of Universe}} \approx \frac{4 \times 10^6 \, M_\odot}{10^{10} \, {\rm yr}} \approx \frac{4 \times 10^{44} \ \text{Planck units}}{6 \times 10^{60} \ \text{Planck units}} \approx 10^{-16} . \qquad (17.26) \]
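The order-of-magnitude estimate (17.26) is easy to reproduce. The sketch below converts the black hole mass and the age of the Universe to Planck units and takes their ratio. The SI values of the solar mass, Planck mass, Planck time, and year are assumed standard values rounded to four figures, and the function names are introduced here only for illustration.

```python
# Assumed SI constants (not given in the text), rounded to four figures.
SOLAR_MASS_KG = 1.989e30
PLANCK_MASS_KG = 2.176e-8
PLANCK_TIME_S = 5.391e-44
YEAR_S = 3.156e7


def mass_in_planck_units(solar_masses):
    """Convert a mass in solar masses to Planck masses."""
    return solar_masses * SOLAR_MASS_KG / PLANCK_MASS_KG


def time_in_planck_units(years):
    """Convert a time in years to Planck times."""
    return years * YEAR_S / PLANCK_TIME_S


mass = mass_in_planck_units(4e6)   # Milky Way black hole mass
age = time_in_planck_units(1e10)   # rough age of the Universe
mdot = mass / age                  # dimensionless accretion rate

print(f"mass ~ {mass:.1e}, age ~ {age:.1e}, Mdot ~ {mdot:.1e}")
```

The mass comes out at a few times 10⁴⁴ Planck masses and the age at several times 10⁶⁰ Planck times, so the ratio is of order 10⁻¹⁶.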

Figure 17.2 shows that the baryonic plasma plunges uneventfully to a central singularity, just as in the Schwarzschild solution. The Weyl curvature scalar hits the Planck scale, |C| = 1, while the baryonic proper density ρb is still well below the Planck density, so this singularity is curvature-dominated.

17.3.7 Black hole accreting a non-conducting charged relativistic plasma

The next simplest case is that of a black hole accreting a charged but non-conducting relativistic plasma. Figure 17.3 shows a black hole with charge-to-mass Q•/M• = 10⁻⁵, but otherwise the same parameters as in the uncharged black hole of §17.3.6: M• = 4 × 10⁶ M⊙, Ṁ• = 10⁻¹⁶, and wb = 0.32. Inside the outer horizon, the baryonic plasma, repelled by the electric charge of the black hole self-consistently generated by the accretion of the charged baryons, becomes outgoing. Like the Reissner-Nordström geometry, the black hole has an (outgoing) inner horizon. The baryons drop through the inner horizon, shortly after which the self-similar solution terminates at an irregular sonic point, where the proper acceleration diverges. Normally

[Plot: |C|, ρe, ρb, and dSb/dSBH, in Planck units, versus radius r in Planck units; the horizon and the inner horizon are marked.]

Figure 17.3 A plasma that is charged but non-conducting. The black hole has an inner horizon like the Reissner-Nordström geometry. The self-similar solution terminates at an irregular sonic point just beneath the inner horizon. The mass is M• = 4 × 10⁶ M⊙, accretion rate Ṁ• = 10⁻¹⁶, equation of state wb = 0.32, and black hole charge-to-mass Q•/M• = 10⁻⁵.

this is a signal that a shock must form, but even if a shock is introduced, the plasma still terminates at an irregular sonic point shortly downstream of the shock. The failure of the self-similar solution to continue does not invalidate the solution, because the failure is hidden beneath the inner horizon, and cannot be communicated to infalling matter above it. The solution is nevertheless not realistic, because it assumes that there is no ingoing matter, such as would inevitably be produced for example by infalling neutral dark matter. Such ingoing matter would appear infinitely blueshifted to the outgoing baryons falling through the inner horizon, which would produce mass inflation, as in §17.3.10.

17.3.8 Black hole accreting a conducting relativistic plasma

What happens if the baryonic plasma is not only electrically charged but also electrically conducting? If the conductivity is small, then the solutions resemble the non-conducting solutions of the previous subsection, §17.3.7. But if the conductivity is large enough effectively to neutralize the plasma as it approaches the center, then the plasma can plunge all the way to the central singularity, as in the uncharged case in §17.3.6. Figure 17.4 shows a case in which the conductivity has been tuned to equal, within numerical accuracy, the critical conductivity κb = 0.35 above which the plasma collapses to a central singularity. The parameters are

[Plot: |C|, ρb, ρe, dSb/dSBH, and |dSb/dScov|, in Planck units, versus radius r in Planck units; the horizon and the Planck scale are marked.]

Figure 17.4 Here the baryonic plasma is charged, and electrically conducting. The conductivity is at (within numerical accuracy) the threshold above which the plasma plunges to a central singularity. The mass is M• = 4 × 10⁶ M⊙, the accretion rate Ṁ• = 10⁻¹⁶, the equation of state wb = 0.32, the charge-to-mass Q•/M• = 10⁻⁵, and the conductivity parameter κb = 0.35. Arrows show how quantities vary a factor of 10 into the past and future.

otherwise the same as in previous subsections: a mass of M• = 4 × 10⁶ M⊙, an accretion rate Ṁ• = 10⁻¹⁶, an equation of state wb = 0.32, and a black hole charge-to-mass of Q•/M• = 10⁻⁵.

The solution at the critical conductivity exhibits the periodic self-similar behavior first discovered in numerical simulations by Choptuik (1993, PRL 70, 9), and known as “critical collapse” because it happens at the borderline between solutions that do and do not collapse to a black hole. The ringing of curves in Figure 17.4 is a manifestation of the self-similar periodicity, not a numerical error.

These solutions are not subject to the mass inflation instability, and they could therefore be prototypical of the behavior inside realistic rotating black holes. For this to work, the outward transport of angular momentum inside a rotating black hole must be large enough effectively to produce zero angular momentum at the center. My instinct is that angular momentum transport is probably not strong enough, but I do not know this for sure. If angular momentum transport is not strong enough, then mass inflation will take place.

Figure 17.4 shows that the entropy produced by Ohmic dissipation inside the black hole can potentially exceed the Bekenstein-Hawking entropy of the black hole by a large factor. The Figure shows the rate dSb/dSBH of increase of entropy per unit increase in its Bekenstein-Hawking entropy, as a function of the hypothetical splat point above which entropy production is truncated. The rate is almost independent of the black hole mass M• at fixed splat density ρ#, so it is legitimate to interpret dSb/dSBH as the cumulative

[Plot: |C|, ρb, ρe, dSb/dSBH, and |dSb/dScov|, in Planck units, versus radius r in Planck units; the horizon and the Planck scale are marked.]

Figure 17.5 This black hole creates a lot of entropy by having a large charge-to-mass Q•/M• = 0.8 and a low accretion rate Ṁ• = 10⁻²⁸. The conductivity parameter κb = 0.35 is again at the threshold above which the plasma plunges to a central singularity. The equation of state is wb = 0.32.

entropy created inside the black hole relative to the Bekenstein-Hawking entropy. Truncated at the Planck scale, |C| = 1, the entropy relative to Bekenstein-Hawking is dSb/dSBH ≈ 10¹⁰.

Generally, the smaller the accretion rate Ṁ•, the more entropy is produced. If moreover the charge-to-mass Q•/M• is large, then the entropy can be produced closer to the outer horizon. Figure 17.5 shows a model with a relatively large charge-to-mass Q•/M• = 0.8, and a low accretion rate Ṁ• = 10⁻²⁸. The large charge-to-mass ratio in spite of the relatively high conductivity requires force-feeding the black hole: the sonic point must be pushed to just above the horizon. The large charge and high conductivity lead to a burst of entropy production just beneath the horizon.

If the entropy created inside a black hole exceeds the Bekenstein-Hawking entropy, and the black hole later evaporates radiating only the Bekenstein-Hawking entropy, then entropy is destroyed, violating the second law of thermodynamics. This startling conclusion is premised on the assumption that entropy created inside a black hole accumulates additively, which in turn derives from the assumption that the Hilbert space of states is multiplicative over spacelike-separated regions. This assumption, called locality, derives from the fundamental proposition of quantum field theory in flat space that field operators at spacelike-separated points commute. This reasoning is essentially the same as originally led Hawking (1976) to conclude that black holes must destroy information.

The same ideas that motivate holography also rescue the second law. If the future lightcones of spacelike-separated points do not intersect, then the points are permanently out of communication, and can behave

[Diagram: partial Penrose diagram showing infalling matter, with labels Sb = SBH, Sb ≫ SBH, and Sb < SBH.]

Figure 17.6 Partial Penrose diagram of the black hole. The entropy passing through the spacelike slice before the black hole evaporates exceeds that passing through the spacelike slice after the black hole evaporates, apparently violating the second law of thermodynamics. However, the entropy passing through any null slice respects the second law.

like alternate quantum realities, like Schrödinger’s dead-and-alive quantum cat. Just as it is not legitimate to add the entropies of the dead cat and the live cat, so also it is not legitimate to add the entropies of regions inside a black hole whose future lightcones do not intersect. The states of such separated regions, instead of being distinct, are quantum entangled with each other.

Figures 17.4 and 17.5 show that the rate |dSb/dScov| at which entropy passes through ingoing or outgoing spherical lightsheets is less than one at all scales below the Planck scale. This shows not only that the black holes obey Bousso’s covariant entropy bound, but also that no individual observer inside the black hole sees more than the Bekenstein-Hawking entropy on their lightcone. No observer actually witnesses a violation of the second law.

17.3.9 Black hole accreting a charged massless scalar field

The charged, non-conducting plasma considered in §17.3.7 fell through an (outgoing) inner horizon without undergoing mass inflation. This can be attributed to the fact that relativistic counter-streaming could not occur: there was only a single (outgoing) fluid, and the speed of sound in the fluid was less than the speed of light. In reality, unless dissipation destroys the inner horizon as in §17.3.8, relativistic counter-streaming between ingoing and outgoing fluids will undoubtedly take place, through gravitational waves if nothing else.

One way to allow relativistic counter-streaming is to let the speed of sound be the speed of light. This is true in a massless scalar (= spin-0) field φ, which has an equation of state wφ = 1. Figure 17.7 shows a black hole that accretes a charged, non-conducting fluid with this equation of state. The parameters are otherwise the same as in previous subsections: a mass of M• = 4 × 10⁶ M⊙, an accretion rate of Ṁ• = 10⁻¹⁶, and a black hole charge-to-mass of Q•/M• = 10⁻⁵. As the Figure shows, mass inflation takes place just above

[Plot: |C|, ρe, and ρφ, in Planck units, versus radius r in Planck units; the horizon and the region of mass inflation are marked.]

Figure 17.7 Instead of a relativistic plasma, this shows a charged scalar field φ whose equation of state wφ = 1 means that the speed of sound equals the speed of light. The scalar field therefore supports relativistic counter-streaming, as a result of which mass inflation occurs just above the erstwhile inner horizon. The mass is M• = 4 × 10⁶ M⊙, the accretion rate Ṁ• = 10⁻¹⁶, the charge-to-mass Q•/M• = 10⁻⁵, and the conductivity is zero.

the place where the inner horizon would be. During mass inflation, the density ρφ and the Weyl scalar C rapidly exponentiate up to the Planck scale and beyond.

One of the remarkable features of the mass inflation instability is that the smaller the accretion rate, the more violent the instability. Figure 17.8 shows mass inflation in a black hole of charge-to-mass Q•/M• = 0.8 accreting a massless scalar field at rates Ṁ• = 0.01, 0.003, and 0.001. The charge-to-mass has been chosen to be largish so that the inner horizon is not too far below the outer horizon, and the accretion rates have been chosen to be large because otherwise the inflationary growth rate is too rapid to be discerned easily on the graph. The density ρφ and Weyl scalar C exponentiate along with, and in proportion to, the interior mass M, which increases as the radius r decreases as

\[ M \propto \exp(- \ln r / \dot{M}_\bullet) . \qquad (17.27) \]

Physically, the scale of length of inflation is set by how close to the inner horizon infalling material approaches before mass inflation begins. The smaller the accretion rate, the closer the approach, and consequently the shorter the length scale of inflation.
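To get a feel for how steeply the scaling (17.27) depends on the accretion rate, note that it is equivalent to the power law M ∝ r^(−1/Ṁ•). The sketch below evaluates the growth of the interior mass, in decades, when the radius shrinks by one percent. It takes the proportionality (17.27) at face value over that interval, which is an assumption made for illustration, and the function name is hypothetical. The logarithm is used because the growth factor itself would overflow for tiny accretion rates.

```python
import math


def log10_mass_growth(r_ratio, mdot):
    """log10 of M(r2)/M(r1) for r2/r1 = r_ratio, from M ∝ exp(-ln r / Mdot)."""
    return -math.log10(r_ratio) / mdot


# Decades of growth in M for a 1% decrease in radius.
for mdot in (0.01, 0.003, 0.001, 1e-16):
    decades = log10_mass_growth(0.99, mdot)
    print(f"Mdot = {mdot:g}: M grows by about {decades:.3g} decades")
```

For Ṁ• = 0.01 the mass grows by less than half a decade over this interval, whereas for Ṁ• = 10⁻¹⁶ the number of decades is itself astronomically large, which illustrates why the text plots only the larger accretion rates.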

[Plot: |C|, ρφ, and ρe, in Planck units, versus radius r in geometric units, for curves labeled 0.01, 0.003, and 0.001; the horizon and the inner horizon are marked.]

Figure 17.8 The density ρφ and Weyl curvature scalar |C| inside a black hole accreting a massless scalar field. The graph shows three cases, with mass accretion rates Ṁ• = 0.01, 0.003, and 0.001. The graph illustrates that the smaller the accretion rate, the faster the density and curvature inflate. Mass inflation destroys the inner horizon: the dashed vertical line labeled “inner horizon” shows the position that the inner horizon would have if mass inflation did not occur. The black hole mass is M• = 4 × 10⁶ M⊙, the charge-to-mass is Q•/M• = 0.8, and the conductivity is zero.

17.3.10 Black hole accreting charged baryons and dark matter

No scalar field (massless or otherwise) has yet been observed in nature, although it is supposed that the Higgs field is a scalar field, and it is likely that cosmological inflation was driven by a scalar field. Another way to allow mass inflation in simple models is to admit not one but two fluids that can counter-stream relativistically through each other. A natural possibility is to feed the black hole not only with a charged relativistic fluid of baryons but also with neutral pressureless dark matter that streams freely through the baryons. The charged baryons, being repelled by the electric charge of the black hole, become outgoing, while the neutral dark matter remains ingoing.

Figure 17.9 shows that relativistic counter-streaming between the baryons and the dark matter causes the center-of-mass density ρ and the Weyl curvature scalar C to inflate quickly up to the Planck scale and beyond. The ratio of dark matter to baryonic density at the sonic point is ρd/ρb = 0.1, but otherwise the parameters are the generic parameters of previous subsections: M• = 4 × 10⁶ M⊙, Ṁ• = 10⁻¹⁶, wb = 0.32, Q•/M• = 10⁻⁵, and zero conductivity. Almost all the center-of-mass energy ρ is in the counter-streaming energy between the outgoing baryonic and ingoing dark matter. The individual densities ρb of baryons and ρd of dark matter (and ρe of electromagnetic energy) increase only modestly.

As in the case of the massless scalar field considered in the previous subsection, §17.3.9, the smaller the

[Plot: |C|, ρ, ρb, ρe, ρd, and dSb/dSBH, in Planck units, versus radius r in Planck units; the horizon and the region of mass inflation are marked.]

Figure 17.9 Back to the relativistic charged baryonic plasma, but now in addition the black hole accretes neutral pressureless dark matter, which streams freely through the baryonic plasma. The relativistic counter-streaming produces mass inflation just above the erstwhile inner horizon. The mass is M• = 4 × 10⁶ M⊙, the accretion rate Ṁ• = 10⁻¹⁶, the baryonic equation of state wb = 0.32, the charge-to-mass Q•/M• = 10⁻⁵, the conductivity is zero, and the ratio of dark matter to baryonic density at the outer sonic point is ρd/ρb = 0.1.

accretion rate, the shorter the length scale of inflation. Not only that, but the smaller one of the ingoing or outgoing streams is relative to the other, the shorter the length scale of inflation. Figure 17.10 shows a black hole with three different values of the dark-matter-to-baryon density ratio at the sonic point, ρd/ρb = 0.3, 0.1, and 0.03, all with the same total accretion rate Ṁ• = 10⁻². The smaller the dark matter stream, the faster is inflation. The accretion rate Ṁ• and the dark-matter-to-baryon ratio ρd/ρb have been chosen to be relatively large so that the inflationary growth rate is easily discernible on the graph.

Figure 17.10 shows that, as in Figure 17.9, almost all the center-of-mass energy is in the streaming energy between the baryons and the dark matter. For one case, ρd/ρb = 0.3, Figure 17.10 shows the individual densities ρb of baryons, ρd of dark matter, and ρe of electromagnetic energy, all of which remain tiny compared to the streaming energy.

17.3.11 The black hole particle accelerator

The previous subsection, §17.3.10, showed that almost all the center-of-mass energy during mass inflation is in the energy of counter-streaming. Thus the black hole acts like an extravagantly powerful particle accelerator. Mass inflation is an exponential instability. The nature of the black hole particle accelerator is that an

[Plot: ρ, |C|, ρd, ρb, and ρe, in Planck units, versus radius r in geometric units, for curves labeled 0.03, 0.1, and 0.3; the horizon and the inner horizon are marked.]

Figure 17.10 The center-of-mass density ρ and Weyl curvature |C| inside a black hole accreting baryons and dark matter at rate Ṁ• = 0.01. The graph shows three cases, with dark-matter-to-baryon ratio at the sonic point of ρd/ρb = 0.3, 0.1, and 0.03. The smaller the ratio of dark matter to baryons, the faster the center-of-mass density ρ and curvature C inflate. For the largest ratio, ρd/ρb = 0.3 (to avoid confusion, only this case is plotted), the graph also shows the individual proper densities ρb of baryons, ρd of dark matter, and ρe of electromagnetic energy. During mass inflation, almost all the center-of-mass energy ρ is in the streaming energy: the proper densities of individual components remain small. The black hole mass is M• = 4 × 10⁶ M⊙, the baryonic equation of state is wb = 0.32, the charge-to-mass is Q•/M• = 0.8, and the conductivity is zero.

individual particle spends approximately an equal interval of proper time being accelerated through each decade of collision energy.

Each baryon in the black hole collider sees a flux nd ur of dark matter particles per unit area per unit time, where nd = ρd/md is the proper number density of dark matter particles in their own frame, and ur is the radial component of the proper 4-velocity, the γv, of the dark matter through the baryons. The γ factor in ur is the relativistic beaming factor: all frequencies, including the collision frequency, are speeded up by the relativistic beaming factor γ. As the baryons accelerate through the collider, they spend a proper time interval dτ/d ln ur in each e-fold of Lorentz factor ur. The number of collisions per baryon per e-fold of ur is the dark matter flux (ρd/md)ur, multiplied by the time dτ/d ln ur, multiplied by the collision cross-section σ. The total cumulative number of collisions that have happened in the black hole particle collider equals this multiplied by the total number of baryons that have fallen into the black hole, which is approximately equal to the black hole mass M• divided by the mass mb per baryon. Thus the total cumulative number of

[Plot: collision rate ρd u dτ/d ln u, in units of Ṁ•/M•, versus velocity u, for curves labeled 0.03, 0.01, 0.003, and 10⁻¹⁶.]

Figure 17.11 Collision rate of the black hole particle accelerator per e-fold of velocity u (meaning γv), expressed in units of the inverse black hole accretion time Ṁ•/M•. The models illustrated are the same as those in Figure 17.10. The curves are labeled with their mass accretion rates: Ṁ• = 0.03, 0.01, 0.003, and 10⁻¹⁶. Stars mark where the center-of-mass energy of colliding baryons and dark matter particles exceeds the Planck energy, while disks show where the Weyl curvature scalar C exceeds the Planck scale.

collisions in the black hole collider is
\[ \frac{\text{number of collisions}}{\text{e-fold of } u_r} = \frac{M_\bullet}{m_b} \, \frac{\rho_d}{m_d} \, \sigma u_r \frac{d\tau}{d \ln u_r} . \tag{17.28} \]

Figure 17.11 shows, for several different accretion rates Ṁ•, the collision rate M• ρd ur dτ/d ln ur of the black hole collider, expressed in units of the black hole accretion rate Ṁ•. This collision rate, multiplied by Ṁ• σ/(md mb), gives the number of collisions (17.28) in the black hole. In the units c = G = 1 being used here, the mass of a baryon (proton) is 1 GeV ≈ 10^−54 m. If the cross-section σ is expressed in canonical accelerator units of femtobarns (1 fb = 10^−43 m^2) then the number of collisions (17.28) is
\[ \frac{\text{number of collisions}}{\text{e-fold of } u_r} = 10^{45} \left( \frac{\sigma}{1\ \text{fb}} \right) \left( \frac{300\ \text{GeV}^2}{m_b m_d} \right) \left( \frac{\dot M_\bullet}{10^{-16}} \right) \left( \frac{M_\bullet \rho_d u_r \, d\tau / d \ln u_r}{0.03 \, \dot M_\bullet} \right) . \tag{17.29} \]

Particle accelerators measure their cumulative luminosities in inverse femtobarns. Equation (17.29) shows that the black hole accelerator delivers about 10^45 inverse femtobarns, and it does so in each e-fold of collision energy up to the Planck energy and beyond.
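The prefactor in equation (17.29) can be checked with a few lines of arithmetic. The sketch below simply inserts the fiducial values quoted in the text into equation (17.28); the constant and function names are illustrative assumptions, and the result should come out close to 10^45.

```python
# Rough arithmetic check of the prefactor in equation (17.29), units c = G = 1.
# Constant and function names are illustrative, not taken from the text.
GEV_IN_M = 1e-54   # 1 GeV expressed as a length, as quoted in the text
FB_IN_M2 = 1e-43   # 1 femtobarn in square metres

def collisions_per_efold(sigma_fb=1.0, mb_md_gev2=300.0,
                         mdot=1e-16, scaled_rate=0.03):
    """Number of collisions per e-fold of u_r, equation (17.28).

    scaled_rate is M rho_d u_r dtau/dln(u_r) in units of Mdot,
    i.e. the quantity plotted in Figure 17.11; mdot is the
    dimensionless accretion rate.
    """
    sigma = sigma_fb * FB_IN_M2              # cross-section in m^2
    mb_md = mb_md_gev2 * GEV_IN_M ** 2       # product of masses in m^2
    return scaled_rate * mdot * sigma / mb_md

print(f"{collisions_per_efold():.2e} collisions per e-fold")
```

Changing `sigma_fb`, `mb_md_gev2`, `mdot`, or `scaled_rate` rescales the answer linearly, exactly as the four bracketed factors in (17.29) do.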


17.4 Instability at outer horizon?

A number of papers have suggested that a magical phase transition at, or just outside, the outer horizon prevents any horizon from forming. Is it true? For example, is there a mass inflation instability at the outer horizon? If there were a White Hole on the other side of the outer horizon, then indeed an object entering the outer horizon would encounter an inflationary instability. But otherwise, no.

PART SEVEN GENERAL RELATIVISTIC PERTURBATION THEORY

Concept Questions

1. Why do general relativistic perturbation theory using the tetrad formalism as opposed to the coordinate approach?
2. Why is the tetrad metric γmn assumed fixed in the presence of perturbations?
3. Are the tetrad axes γm fixed under a perturbation?
4. Is it true that the tetrad components ϕmn of a perturbation are (anti-)symmetric in m ↔ n if and only if its coordinate components ϕµν are (anti-)symmetric in µ ↔ ν?
5. Does an unperturbed quantity, such as the unperturbed metric $\overset{0}{g}_{\mu\nu}$, change under an infinitesimal coordinate gauge transformation?
6. How can the vierbein perturbation ϕmn be considered a tetrad tensor field if it changes under an infinitesimal coordinate gauge transformation?
7. What properties of the unperturbed spacetime allow decomposition of perturbations into independently evolving Fourier modes?
8. What properties of the unperturbed spacetime allow decomposition of perturbations into independently evolving scalar, vector, and tensor modes?
9. In what sense do scalar, vector, and tensor modes have spin 0, 1, and 2 respectively?
10. Tensor modes represent gravitational waves that, in vacuo, propagate at the speed of light. Do scalar and vector modes also propagate at the speed of light in vacuo? If so, do scalar and vector modes also constitute gravitational waves?
11. If scalar, vector, and tensor modes evolve independently, does that mean that scalar modes can exist and evolve in the complete absence of tensor modes? If so, does it mean that scalar modes can propagate causally, in vacuo at the speed of light, without any tensor modes being present?
12. Equation (20.74) defines the mass M of a body as what a distant observer would measure from its gravitational potential. Similarly equation (20.82) defines the angular momentum L of a body as what a distant observer would measure from the dragging of inertial frames. In what sense are these definitions legitimate?
13. Can an observer far from a body detect the difference between the scalar potentials Ψ and Φ produced by the body?


14. If a gravitational wave is a wave of spacetime itself, distorting the very rulers and clocks that measure spacetime, how is it possible to measure gravitational waves at all?
15. Have gravitational waves been detected?
16. If gravitational waves carry energy-momentum, then can gravitational waves be present in a region of spacetime with vanishing energy-momentum tensor, Tmn = 0?

What’s important?

1. Getting your brain around coordinate and tetrad gauge transformations.
2. A central aim of general relativistic perturbation theory is to identify the coordinate and tetrad gauge-invariant perturbations, since only these have physical meaning.
3. A second central aim is to classify perturbations into independently evolving modes, to the extent that this is possible.
4. In background spacetimes with spatial translation and rotation symmetry, which includes Minkowski space and the Friedmann-Robertson-Walker metric of cosmology, modes decompose into independently evolving scalar (spin-0), vector (spin-1), and tensor (spin-2) modes. In background spacetimes without spatial translation and rotation symmetry, such as black holes, scalar, vector, and tensor modes scatter off the curvature of space, and therefore mix with each other.
5. In background spacetimes with spatial translation and rotation symmetry, there are 6 algebraic combinations of metric coefficients that are coordinate and tetrad gauge-invariant, and therefore represent physical perturbations. There are 2 scalar modes, 2 vector modes, and 2 tensor modes. A spin-m mode varies as e^{imχ}, where χ is the rotational angle about the spatial wavevector k of the mode.
6. In background spacetimes without spatial translation and rotation symmetry, the coordinate and tetrad gauge-invariant perturbations are not algebraic combinations of the metric coefficients, but rather combinations that involve first and second derivatives of the metric coefficients. Gravitational waves are described by the Weyl tensor, which can be decomposed into 5 complex components, with spin 0, ±1, and ±2. The spin-±2 components describe propagating gravitational waves, while the spin-0 and spin-±1 components describe the non-propagating gravitational field near a source.
7. The preeminent application of general relativistic perturbation theory is to cosmology. Coupled with physics that is either well understood (such as photon-electron scattering) or straightforward to model even without a deep understanding (such as the dynamical behavior of non-baryonic dark matter and dark energy), the theory has yielded predictions that are in spectacular agreement with observations of fluctuations in the CMB and in the large scale distribution of galaxies and other tracers of the distribution of matter in the Universe.

18 Perturbations and gauge transformations

This chapter sets up the basic equations that define perturbations to an arbitrary spacetime in the tetrad formalism of general relativity, and it examines the effect of tetrad and coordinate gauge transformations on those perturbations. The perturbations are supposed to be small, in the sense that quantities quadratic in the perturbations can be neglected. The formalism set up in this chapter provides a foundation used in subsequent chapters.

18.1 Notation for perturbations

A 0 (zero) overscript signifies an unperturbed quantity, while a 1 (one) overscript signifies a perturbation. No overscript means the full quantity, including both unperturbed and perturbed parts. An overscript is attached only where necessary. Thus if the unperturbed part of a quantity is zero, then no overscript is needed, and none is attached.

18.2 Vierbein perturbation

Let the vierbein perturbation ϕmn be defined so that the perturbed vierbein is
\[ e_m{}^\mu = (\delta_m^n + \varphi_m{}^n) \, \overset{0}{e}_n{}^\mu , \tag{18.1} \]
with corresponding inverse
\[ e^m{}_\mu = (\delta^m_n - \varphi_n{}^m) \, \overset{0}{e}{}^n{}_\mu . \tag{18.2} \]
Since the perturbation ϕm^n is already of linear order, to linear order its indices can be raised and lowered with the unperturbed metric, and transformed between tetrad and coordinate frames with the unperturbed vierbein. In practice it proves convenient to work with the covariant tetrad-frame components ϕmn of the vierbein perturbation
\[ \varphi_{mn} = \gamma_{nl} \, \varphi_m{}^l . \tag{18.3} \]


The perturbation ϕmn can be regarded as a tetrad tensor field defined on the unperturbed background.

18.3 Gauge transformations

The vierbein perturbation ϕmn has 16 degrees of freedom, but only 6 of these degrees of freedom correspond to real physical perturbations, since 6 degrees of freedom are associated with arbitrary infinitesimal changes in the choice of tetrad, which is to say arbitrary infinitesimal Lorentz transformations, and a further 4 degrees of freedom are associated with arbitrary infinitesimal changes in the coordinates. In the context of perturbation theory, these infinitesimal tetrad and coordinate transformations are called gauge transformations. Real physical perturbations are perturbations that are gauge-invariant under both tetrad and coordinate gauge transformations.

18.4 Tetrad metric assumed constant

In the tetrad formalism, tetrad axes γm are introduced as locally inertial (or other physically motivated) axes attached to an observer. The axes enable quantities to be projected into the frame of the observer. In a spacetime buffeted by perturbations, it is natural for an observer to cling to the rock provided by the locally inertial (or other) axes, as opposed to allowing the axes to bend with the wind. For example, when a gravitational wave goes by, the tidal compression and rarefaction causes the proper distance between two freely falling test masses to oscillate. It is natural to choose the tetrad so that it continues to measure proper times and distances in the perturbed spacetime. In these notes on general relativistic perturbation theory, the tetrad metric will be taken to be constant everywhere, and unchanged by a perturbation
\[ \gamma_{mn} = \overset{0}{\gamma}_{mn} = \text{constant} . \tag{18.4} \]

For example, if the tetrad is orthonormal, then the tetrad metric is constant, the Minkowski metric ηmn . However, the tetrad could also be some other tetrad for which the tetrad metric is constant, such as a spinor tetrad (§12.1.1), or a Newman-Penrose tetrad (§12.2.1).

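18.5 Perturbed coordinate metric

The perturbed coordinate metric is
\begin{align}
g_{\mu\nu} &= \gamma_{mn} \, e^m{}_\mu \, e^n{}_\nu \nonumber \\
&= \gamma_{kl} \, (\delta^k_m - \varphi_m{}^k) \, \overset{0}{e}{}^m{}_\mu \, (\delta^l_n - \varphi_n{}^l) \, \overset{0}{e}{}^n{}_\nu \nonumber \\
&= \overset{0}{g}_{\mu\nu} - (\varphi_{\mu\nu} + \varphi_{\nu\mu}) . \tag{18.5}
\end{align}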


Thus the perturbation of the coordinate metric depends only on the symmetric part of the vierbein perturbation ϕmn , not the antisymmetric part
\[ \overset{1}{g}_{\mu\nu} = - (\varphi_{\mu\nu} + \varphi_{\nu\mu}) . \tag{18.6} \]

18.6 Tetrad gauge transformations

Under an infinitesimal tetrad transformation, the covariant vierbein perturbations ϕmn transform as
\[ \varphi_{mn} \rightarrow \varphi_{mn} + \epsilon_{mn} , \tag{18.7} \]
where ϵmn is the generator of a Lorentz transformation, which is to say an arbitrary antisymmetric tensor (Exercise 11.2). Thus the antisymmetric part ϕmn − ϕnm of the covariant perturbation ϕmn is arbitrarily adjustable through an infinitesimal tetrad transformation, while the symmetric part ϕmn + ϕnm is tetrad gauge-invariant. It is easy to see when a quantity is tetrad gauge-invariant: it is tetrad gauge-invariant if and only if it depends only on the symmetric part of the vierbein perturbation, not on the antisymmetric part. Evidently the perturbation (18.6) to the coordinate metric gµν is tetrad gauge-invariant. This is as it should be, since the coordinate metric gµν is a coordinate-frame quantity, independent of the choice of tetrad frame.

If only tetrad gauge-invariant perturbations are physical, why not just discard tetrad perturbations (the antisymmetric part of ϕmn ) altogether, and work only with the tetrad gauge-invariant part (the symmetric part of ϕmn )? The answer is that tetrad-frame quantities such as the tetrad-frame Einstein tensor do change under tetrad gauge transformations (infinitesimal Lorentz transformations of the tetrad). It is true that the only physical perturbations of the Einstein tensor are those combinations of it that are tetrad gauge-invariant. But in order to identify these tetrad gauge-invariant combinations, it is necessary to carry through the dependence on the non-tetrad-gauge-invariant part, the antisymmetric part of ϕmn .

Much of the professional literature on general relativistic perturbation theory works with the traditional coordinate formalism, as opposed to the tetrad formalism. The term “gauge-invariant” then means coordinate gauge-invariant, as opposed to both coordinate and tetrad gauge-invariant. This is fine as far as it goes: the coordinate approach is perfectly able to identify physical perturbations versus gauge perturbations. However, there still remains the problem of projecting the perturbations into the frame of an observer, so ultimately the issue of perturbations of the observer’s frame, tetrad perturbations, must be faced.
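The two statements above, that the metric perturbation is −(ϕµν + ϕνµ) to linear order and that adding an antisymmetric ϵmn leaves it unchanged, can be checked numerically. The following is a minimal sketch that assumes a flat background with an orthonormal tetrad aligned with the coordinates (unperturbed vierbein equal to the unit matrix) and signature (−,+,+,+); all function names are illustrative.

```python
import random

# Minkowski tetrad metric, signature (-,+,+,+); it is its own inverse.
ETA = [[-1.0 if m == n == 0 else (1.0 if m == n else 0.0) for n in range(4)]
       for m in range(4)]

def inverse_vierbein(phi):
    """e^m_mu = delta^m_mu - phi_mu^m, eq. (18.2) with a unit unperturbed vierbein.

    phi[m][n] holds the covariant components phi_mn; the second index is
    raised with ETA. The result is indexed e[m][mu].
    """
    return [[(1.0 if m == mu else 0.0)
             - sum(phi[mu][l] * ETA[l][m] for l in range(4))
             for mu in range(4)] for m in range(4)]

def exact_metric(phi):
    """g_mu_nu = gamma_mn e^m_mu e^n_nu, keeping all orders in phi."""
    e = inverse_vierbein(phi)
    return [[sum(ETA[m][n] * e[m][mu] * e[n][nu]
                 for m in range(4) for n in range(4))
             for nu in range(4)] for mu in range(4)]

def linear_metric(phi):
    """First-order metric eta - (phi + phi^T), eq. (18.5)."""
    return [[ETA[mu][nu] - (phi[mu][nu] + phi[nu][mu]) for nu in range(4)]
            for mu in range(4)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

rng = random.Random(0)
phi = [[rng.uniform(-1e-4, 1e-4) for _ in range(4)] for _ in range(4)]
raw = [[rng.uniform(-1e-4, 1e-4) for _ in range(4)] for _ in range(4)]
eps = [[raw[m][n] - raw[n][m] for n in range(4)] for m in range(4)]  # antisymmetric
phi_shifted = [[phi[m][n] + eps[m][n] for n in range(4)] for m in range(4)]

# Residual is second order in phi; the tetrad shift drops out at first order.
print(max_diff(exact_metric(phi), linear_metric(phi)))
print(max_diff(linear_metric(phi), linear_metric(phi_shifted)))
```

The first printed number is quadratically small in the perturbation amplitude, and the second is zero up to rounding, which is the content of equations (18.5) to (18.7).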

Concept question 18.1

In perturbation theory, can tetrad gauge transformations be non-infinitesimal?


18.7 Coordinate gauge transformations

A coordinate gauge transformation is a transformation of the coordinates xµ by an infinitesimal shift ϵµ
\[ x^\mu \rightarrow x'^\mu = x^\mu + \epsilon^\mu . \tag{18.8} \]
You should not think of this as shifting the underlying spacetime around; rather, it is just a change of the coordinate system, which leaves the underlying spacetime unchanged. Because the shift ϵµ is, like the vierbein perturbations ϕmn , already of linear order, its indices can be raised and lowered with the unperturbed metric, and transformed between coordinate and tetrad frames with the unperturbed vierbein. Thus the shift ϵµ can be regarded as a vector field defined on the unperturbed background. The tetrad components ϵm of the shift ϵµ are
\[ \epsilon^m = \overset{0}{e}{}^m{}_\mu \, \epsilon^\mu . \tag{18.9} \]
Physically, the tetrad-frame shift ϵm is the shift measured in locally inertial coordinates
\[ \xi^m \rightarrow \xi'^m = \xi^m + \epsilon^m . \tag{18.10} \]

18.8 Coordinate gauge transformation of a coordinate scalar

Under a coordinate transformation (18.8), a coordinate-frame scalar Φ(x) remains unchanged
\[ \Phi(x) \rightarrow \Phi'(x') = \Phi(x) . \tag{18.11} \]
Here the scalar Φ′(x′) is evaluated at position x′, which is the same as the original physical position x since all that has changed is the coordinates, not the physical position. However, in perturbation theory, quantities are evaluated at coordinate position x, not x′. The value of Φ′ at x is related to that at x′ by
\[ \Phi'(x) = \Phi'(x' - \epsilon) = \Phi'(x') - \epsilon^\kappa \frac{\partial \Phi'}{\partial x^\kappa} . \tag{18.12} \]
Since ϵκ is a small quantity, and Φ′ differs from Φ by a small quantity, the last term ϵκ ∂Φ′/∂xκ in equation (18.12) can be replaced by ϵκ ∂Φ/∂xκ to linear order. Putting equations (18.11) and (18.12) together shows that the scalar Φ changes under a coordinate gauge transformation (18.8) as
\[ \Phi(x) \rightarrow \Phi'(x) = \Phi(x) - \epsilon^\kappa \frac{\partial \Phi}{\partial x^\kappa} . \tag{18.13} \]
The transformation (18.13) can also be written
\[ \Phi(x) \rightarrow \Phi'(x) = \Phi(x) + \mathcal{L}_\epsilon \Phi , \tag{18.14} \]
where Lϵ is the Lie derivative, §18.13.
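A one-dimensional numerical illustration of the scalar rule may help. The sketch below picks a hypothetical scalar field and a hypothetical small shift, computes Φ′(x) exactly by locating the physical point whose new coordinate is x, and compares it with the first-order formula (18.13). The field, the shift, and the function names are assumptions made only for the example.

```python
import math

def scalar_field(x):
    """Hypothetical scalar field Phi(x)."""
    return math.sin(x)

def scalar_gradient(x):
    """dPhi/dx for the field above."""
    return math.cos(x)

def shift(x, s):
    """Hypothetical coordinate shift eps(x) of small amplitude s."""
    return s * math.cos(2.0 * x)

def transformed_exact(x, s):
    """Phi'(x): Phi at the physical point whose NEW coordinate equals x."""
    x0 = x
    for _ in range(50):            # solve x0 + eps(x0) = x by fixed-point iteration
        x0 = x - shift(x0, s)
    return scalar_field(x0)

def transformed_linear(x, s):
    """First-order prediction Phi - eps dPhi/dx of equation (18.13)."""
    return scalar_field(x) - shift(x, s) * scalar_gradient(x)

for s in (1e-2, 1e-3):
    err = abs(transformed_exact(0.4, s) - transformed_linear(0.4, s))
    print(s, err)                  # the error shrinks roughly as s**2
```

The residual falling quadratically with the shift amplitude is what "to linear order" means in the argument above.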


18.9 Coordinate gauge transformation of a coordinate vector or tensor

A similar argument applies to coordinate vectors and tensors. Under a coordinate transformation (18.8), a coordinate-frame 4-vector Aµ(x) transforms in the usual way as
\[ A^\mu(x) \rightarrow A'^\mu(x') = A^\kappa(x) \frac{\partial x'^\mu}{\partial x^\kappa} = A^\mu(x) + A^\kappa(x) \frac{\partial \epsilon^\mu}{\partial x^\kappa} . \tag{18.15} \]
As in the scalar case, the vector A′µ(x′) is evaluated at position x′, which is the same as the original physical position x since all that has changed is the coordinates, not the physical position. Again, in perturbation theory, quantities are evaluated at coordinate position x, not x′. The value of A′µ at x is related to that at x′ by
\[ A'^\mu(x) = A'^\mu(x' - \epsilon) = A'^\mu(x') - \epsilon^\kappa \frac{\partial A'^\mu}{\partial x^\kappa} . \tag{18.16} \]
The last term ϵκ ∂A′µ/∂xκ in equation (18.16) can be replaced by ϵκ ∂Aµ/∂xκ to linear order. Putting equations (18.15) and (18.16) together shows that the 4-vector Aµ changes under a coordinate gauge transformation (18.8) as
\[ A^\mu(x) \rightarrow A'^\mu(x) = A^\mu(x) + A^\kappa \frac{\partial \epsilon^\mu}{\partial x^\kappa} - \epsilon^\kappa \frac{\partial A^\mu}{\partial x^\kappa} . \tag{18.17} \]
The transformation (18.17) can also be written
\[ A^\mu(x) \rightarrow A'^\mu(x) = A^\mu(x) + \mathcal{L}_\epsilon A^\mu , \tag{18.18} \]
where Lϵ is the Lie derivative, §18.13. More generally, under a coordinate gauge transformation (18.8), a coordinate tensor transforms as, equation (18.32),
\[ A^{\kappa\lambda...}_{\mu\nu...}(x) \rightarrow A'^{\kappa\lambda...}_{\mu\nu...}(x) = A^{\kappa\lambda...}_{\mu\nu...}(x) + \mathcal{L}_\epsilon A^{\kappa\lambda...}_{\mu\nu...} . \tag{18.19} \]

18.10 Coordinate gauge transformation of a tetrad vector

A tetrad-frame 4-vector Am is a coordinate-invariant quantity, and therefore acts like a coordinate scalar, equation (18.13), under a coordinate gauge transformation (18.8)
\[ A^m(x) \rightarrow A'^m(x) = A^m(x) - \epsilon^\kappa \frac{\partial A^m}{\partial x^\kappa} = A^m(x) - \epsilon^k \partial_k A^m . \tag{18.20} \]
The change −ϵk ∂k Am is a coordinate tensor (specifically, a coordinate scalar), but it is not a tetrad tensor. More generally, a tetrad-frame tensor transforms under a coordinate gauge transformation (18.8) as
\[ A^{kl...}_{mn...}(x) \rightarrow A'^{kl...}_{mn...}(x) = A^{kl...}_{mn...}(x) - \epsilon^a \partial_a A^{kl...}_{mn...} . \tag{18.21} \]
Again, the change −ϵa ∂a A^{kl...}_{mn...} is a coordinate tensor (a coordinate scalar), but not a tetrad tensor.


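18.11 Coordinate gauge transformation of the vierbein

The inverse vierbein e^m_µ equals the scalar product of the tetrad and coordinate axes, e^m_µ = γ^m · gµ . Therefore the transformation of the vierbein under a coordinate gauge transformation (18.8) follows from the transformations of γ^m and gµ . The tetrad axes γ^m transform in accordance with (18.20) as
\[ \boldsymbol{\gamma}^m \rightarrow \boldsymbol{\gamma}'^m = \boldsymbol{\gamma}^m - \epsilon^k \partial_k \boldsymbol{\gamma}^m = \boldsymbol{\gamma}^m + \epsilon^k \Gamma^m_{nk} \boldsymbol{\gamma}^n . \tag{18.22} \]
The coordinate axes gµ transform in accordance with (18.32) and (18.34) as (including torsion S^ν_{µκ})
\begin{align}
\boldsymbol{g}_\mu \rightarrow \boldsymbol{g}'_\mu &= \boldsymbol{g}_\mu + \mathcal{L}_\epsilon \boldsymbol{g}_\mu \nonumber \\
&= \boldsymbol{g}_\mu - \boldsymbol{g}_\nu D_\mu \epsilon^\nu - \epsilon^\nu D_\nu \boldsymbol{g}_\mu - \boldsymbol{g}_\nu S^\nu_{\mu\kappa} \epsilon^\kappa \nonumber \\
&= \boldsymbol{g}_\mu - D_\mu (\boldsymbol{g}_\nu \epsilon^\nu) - \boldsymbol{g}_\nu S^\nu_{\mu\kappa} \epsilon^\kappa \nonumber \\
&= \boldsymbol{g}_\mu - \boldsymbol{\gamma}_n \left( D_m \epsilon^n + S^n_{mk} \epsilon^k \right) \overset{0}{e}{}^m{}_\mu , \tag{18.23}
\end{align}
where the third line follows from the second because the axes gν are by definition covariantly constant, Dµ gν = 0. It follows from (18.22) and (18.23) that the inverse vierbein e^m_µ transforms under an infinitesimal coordinate gauge transformation (18.8) as
\begin{align}
e^m{}_\mu \rightarrow e'^m{}_\mu &= \boldsymbol{\gamma}'^m \cdot \boldsymbol{g}'_\mu \nonumber \\
&= e^m{}_\mu + \left( - D_n \epsilon^m - S^m_{nk} \epsilon^k + \Gamma^m_{nk} \epsilon^k \right) \overset{0}{e}{}^n{}_\mu . \tag{18.24}
\end{align}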

From equation (18.24) and the definition (18.2) of the vierbein perturbations ϕmn , it follows that the vierbein perturbations transform under a coordinate gauge transformation (18.8) as
\[ \varphi_{mn} \rightarrow \varphi'_{mn} = \varphi_{mn} + \partial_m \epsilon_n - (\Gamma_{knm} + \Gamma_{nmk} - S_{nmk}) \, \epsilon^k , \tag{18.25} \]
in which ϵk are the tetrad components of the coordinate shift, and Γkmn are tetrad connection coefficients. If torsion vanishes, Snmk = 0, as general relativity assumes, then the transformation (18.25) of the vierbein perturbations under a coordinate gauge transformation reduces to
\[ \varphi_{mn} \rightarrow \varphi'_{mn} = \varphi_{mn} + \partial_m \epsilon_n - (\Gamma_{knm} + \Gamma_{nmk}) \, \epsilon^k . \tag{18.26} \]
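The step from (18.24) to (18.25) can be spelled out as follows. This is a sketch, not part of the original derivation: it writes δ for the change induced by the gauge transformation, and it assumes the tetrad covariant derivative of a covariant vector takes the form D_m ϵ_n = ∂_m ϵ_n − Γ_{knm} ϵ^k, a convention that is consistent with (18.25) but is not restated in this section.

```latex
\begin{align*}
\delta e^m{}_\mu
  &= - \, \delta\varphi_n{}^m \, \overset{0}{e}{}^n{}_\mu
  && \text{from the definition (18.2)} \\
\delta\varphi_{nm}
  &= D_n \epsilon_m + S_{mnk} \, \epsilon^k - \Gamma_{mnk} \, \epsilon^k
  && \text{comparing with (18.24) and lowering } m \\
\delta\varphi_{mn}
  &= D_m \epsilon_n - (\Gamma_{nmk} - S_{nmk}) \, \epsilon^k
  && \text{relabelling } m \leftrightarrow n \\
  &= \partial_m \epsilon_n - (\Gamma_{knm} + \Gamma_{nmk} - S_{nmk}) \, \epsilon^k
  && \text{using } D_m \epsilon_n = \partial_m \epsilon_n - \Gamma_{knm} \, \epsilon^k .
\end{align*}
```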

18.12 Coordinate gauge transformation of the metric

The tetrad metric γmn transforms under an infinitesimal coordinate gauge transformation (18.8) as
\[ \gamma_{mn} \rightarrow \gamma'_{mn} = \gamma_{mn} - (\Gamma_{mnk} + \Gamma_{nmk}) \, \epsilon^k = \gamma_{mn} , \tag{18.27} \]
where the last expression is true because the tetrad metric γmn is being assumed constant, equation (18.4), in which case Γmnk + Γnmk = ∂k γnm = 0.


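The coordinate metric gµν transforms under an infinitesimal coordinate gauge transformation (18.8) as
\[ g_{\mu\nu} \rightarrow g'_{\mu\nu} = g_{\mu\nu} + \mathcal{L}_\epsilon g_{\mu\nu} = g_{\mu\nu} - (D_\mu \epsilon_\nu + D_\nu \epsilon_\mu) - (S_{\mu\nu\kappa} + S_{\nu\mu\kappa}) \, \epsilon^\kappa . \tag{18.28} \]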

18.13 Lie derivative

The change in the coordinate 4-vector Aµ on the right hand side of equation (18.17) is called the Lie derivative of Aµ along the direction ϵκ , and it is designated by the operator Lϵ
\[ \mathcal{L}_\epsilon A^\mu \equiv A^\kappa \frac{\partial \epsilon^\mu}{\partial x^\kappa} - \epsilon^\kappa \frac{\partial A^\mu}{\partial x^\kappa} . \tag{18.29} \]
The Lie derivative has the important property of being a tensor, which is one reason that it merits a special name. As its name suggests, the Lie derivative acts like a derivative: it is linear, and it satisfies the Leibniz rule. Translating from ordinary partial derivatives to covariant derivatives yields the following expression for the Lie derivative in covariant form
\[ \mathcal{L}_\epsilon A^\mu = A^\kappa D_\kappa \epsilon^\mu - \epsilon^\kappa D_\kappa A^\mu + A^\kappa \epsilon^\lambda S^\mu_{\kappa\lambda} \quad \text{is a coordinate vector} , \tag{18.30} \]
where S^µ_{κλ} is the torsion. If torsion vanishes, as GR assumes, then the Lie derivative of a 4-vector is
\[ \mathcal{L}_\epsilon A^\mu = A^\kappa D_\kappa \epsilon^\mu - \epsilon^\kappa D_\kappa A^\mu \quad \text{is a coordinate vector} . \tag{18.31} \]
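The claim that the first-order change of a vector under a coordinate shift is the Lie derivative (18.29) can be tested numerically in two flat dimensions. The sketch below uses hypothetical fields for Aµ and ϵµ with hand-coded derivatives, computes A′µ(x) exactly from the transformation law (18.15), and compares it with Aµ + Lϵ Aµ. All field choices and function names are assumptions for illustration.

```python
import math

def eps(x, y, s):
    """Hypothetical shift field eps^mu = s * (sin y, x*y)."""
    return (s * math.sin(y), s * x * y)

def d_eps(x, y, s):
    """Jacobian d eps^mu / d x^kappa, indexed [mu][kappa]."""
    return ((0.0, s * math.cos(y)),
            (s * y, s * x))

def A(x, y):
    """Hypothetical vector field A^mu."""
    return (x * x, math.cos(x) + y)

def d_A(x, y):
    """Jacobian d A^mu / d x^kappa, indexed [mu][kappa]."""
    return ((2.0 * x, 0.0),
            (-math.sin(x), 1.0))

def lie_derivative(x, y, s):
    """L_eps A^mu = A^kappa d_kappa eps^mu - eps^kappa d_kappa A^mu, eq. (18.29)."""
    a, e, da, de = A(x, y), eps(x, y, s), d_A(x, y), d_eps(x, y, s)
    return tuple(sum(a[k] * de[m][k] - e[k] * da[m][k] for k in range(2))
                 for m in range(2))

def transformed_exact(x, y, s):
    """A'^mu at coordinate value (x, y), from the full law (18.15)."""
    x0, y0 = x, y
    for _ in range(50):            # solve x0 + eps(x0) = (x, y) by iteration
        e = eps(x0, y0, s)
        x0, y0 = x - e[0], y - e[1]
    a, de = A(x0, y0), d_eps(x0, y0, s)
    return tuple(a[m] + sum(de[m][k] * a[k] for k in range(2))
                 for m in range(2))

def gauge_error(x, y, s):
    """Largest component of A'(x) - (A + L_eps A); should scale as s**2."""
    a, lie, exact = A(x, y), lie_derivative(x, y, s), transformed_exact(x, y, s)
    return max(abs(exact[m] - (a[m] + lie[m])) for m in range(2))

for s in (1e-2, 1e-3):
    print(s, gauge_error(0.7, 0.3, s))
```

Reducing the shift amplitude by a factor of 10 should reduce the residual by about a factor of 100, confirming that (18.17) captures everything at first order.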

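More generally, under a coordinate gauge transformation (18.8), a coordinate tensor transforms as
\[ A^{\kappa\lambda...}_{\mu\nu...}(x) \rightarrow A'^{\kappa\lambda...}_{\mu\nu...}(x) = A^{\kappa\lambda...}_{\mu\nu...}(x) + \mathcal{L}_\epsilon A^{\kappa\lambda...}_{\mu\nu...} \tag{18.32} \]
where the Lie derivative is defined by
\[ \mathcal{L}_\epsilon A^{\kappa\lambda...}_{\mu\nu...} \equiv A^{\alpha\lambda...}_{\mu\nu...} \frac{\partial \epsilon^\kappa}{\partial x^\alpha} + A^{\kappa\alpha...}_{\mu\nu...} \frac{\partial \epsilon^\lambda}{\partial x^\alpha} \, ... - A^{\kappa\lambda...}_{\alpha\nu...} \frac{\partial \epsilon^\alpha}{\partial x^\mu} - A^{\kappa\lambda...}_{\mu\alpha...} \frac{\partial \epsilon^\alpha}{\partial x^\nu} \, ... - \epsilon^\alpha \frac{\partial A^{\kappa\lambda...}_{\mu\nu...}}{\partial x^\alpha} . \tag{18.33} \]
In covariant form, equation (18.33) is
\begin{align}
\mathcal{L}_\epsilon A^{\kappa\lambda...}_{\mu\nu...} ={}& A^{\alpha\lambda...}_{\mu\nu...} D_\alpha \epsilon^\kappa + A^{\kappa\alpha...}_{\mu\nu...} D_\alpha \epsilon^\lambda \, ... - A^{\kappa\lambda...}_{\alpha\nu...} D_\mu \epsilon^\alpha - A^{\kappa\lambda...}_{\mu\alpha...} D_\nu \epsilon^\alpha \, ... - \epsilon^\alpha D_\alpha A^{\kappa\lambda...}_{\mu\nu...} \nonumber \\
& + \left( A^{\alpha\lambda...}_{\mu\nu...} S^\kappa_{\alpha\beta} + A^{\kappa\alpha...}_{\mu\nu...} S^\lambda_{\alpha\beta} \, ... - A^{\kappa\lambda...}_{\alpha\nu...} S^\alpha_{\mu\beta} - A^{\kappa\lambda...}_{\mu\alpha...} S^\alpha_{\nu\beta} \right) \epsilon^\beta \quad \text{is a coordinate tensor} , \tag{18.34}
\end{align}
which in the case of vanishing torsion, as GR assumes, reduces to
\[ \mathcal{L}_\epsilon A^{\kappa\lambda...}_{\mu\nu...} = A^{\alpha\lambda...}_{\mu\nu...} D_\alpha \epsilon^\kappa + A^{\kappa\alpha...}_{\mu\nu...} D_\alpha \epsilon^\lambda \, ... - A^{\kappa\lambda...}_{\alpha\nu...} D_\mu \epsilon^\alpha - A^{\kappa\lambda...}_{\mu\alpha...} D_\nu \epsilon^\alpha \, ... - \epsilon^\alpha D_\alpha A^{\kappa\lambda...}_{\mu\nu...} \quad \text{is a coordinate tensor} . \tag{18.35} \]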

19 Scalar, vector, tensor decomposition

In the particular case that the unperturbed spacetime is spatially homogeneous and isotropic, which includes not only Minkowski space but also the important case of the cosmological Friedmann-Robertson-Walker metric, perturbations decompose into independently evolving scalar (spin-0), vector (spin-1), and tensor (spin-2) modes. Similarly to Fourier decomposition, decomposition into scalar, vector, and tensor modes is non-local, in principle requiring knowledge of perturbation amplitudes simultaneously throughout all of space. In practical problems however, an adequate decomposition is possible as long as the scales probed are sufficiently larger than the wavelengths of the modes probed. Ultimately, the fact that an adequate decomposition is possible is a consequence of the fact that gravitational fluctuations in the real Universe appear to converge at the cosmological horizon, so that what happens locally is largely independent of what is happening far away.

19.1 Decomposition of a vector in flat 3D space

Theorem: In flat 3-dimensional space, a 3-vector field w(x) can be decomposed uniquely (subject to the boundary condition that w vanishes sufficiently rapidly at infinity) into a sum of scalar and vector parts
\[ \boldsymbol{w} = \underbrace{\boldsymbol{\nabla} w_\parallel}_{\text{scalar}} + \underbrace{\boldsymbol{w}_\perp}_{\text{vector}} . \tag{19.1} \]
In this context, the term vector signifies a 3-vector w⊥ that is transverse, that is to say, it has vanishing divergence,
\[ \boldsymbol{\nabla} \cdot \boldsymbol{w}_\perp = 0 . \tag{19.2} \]
Here ∇ ≡ ∂/∂x ≡ ∇i ≡ ∂/∂x^i is the gradient in flat 3D space. The scalar and vector parts are also known as spin-0 and spin-1, or gradient and curl, or longitudinal and transverse. The scalar part ∇w∥ contains 1 degree of freedom, while the vector part w⊥ contains 2 degrees of freedom. Together they account for the 3 degrees of freedom of the vector w.


Proof: Take the divergence of equation (19.1)
\[ \boldsymbol{\nabla} \cdot \boldsymbol{w} = \nabla^2 w_\parallel . \tag{19.3} \]
The operator ∇² on the right hand side of equation (19.3) is the 3D Laplacian. The solution of equation (19.3) is
\[ w_\parallel(\boldsymbol{x}) = - \int \frac{\boldsymbol{\nabla}' \cdot \boldsymbol{w}(\boldsymbol{x}')}{|\boldsymbol{x}' - \boldsymbol{x}|} \, \frac{d^3 x'}{4\pi} . \tag{19.4} \]
The solution (19.4) is valid subject to boundary conditions that the vector w vanish sufficiently rapidly at infinity. In cosmology, the required boundary conditions, which are set at the Big Bang, are apparently satisfied because fluctuations at the Big Bang were small. Equation (19.1) then immediately implies that the vector part is w⊥ = w − ∇w∥ .

19.2 Fourier version of the decomposition of a vector in flat 3D space

When the background has some symmetry, it is natural to expand perturbations in eigenmodes of the symmetry. If the background space is flat, then it is translation symmetric. Eigenmodes of the translation operator ∇ are Fourier modes. A function a(x) in flat 3D space and its Fourier transform a(k) are related by (the disposition of factors of 2π in the following definition follows the convention most commonly adopted by cosmologists)
\[ a(\boldsymbol{k}) = \int a(\boldsymbol{x}) \, e^{i \boldsymbol{k} \cdot \boldsymbol{x}} \, d^3 x , \quad a(\boldsymbol{x}) = \int a(\boldsymbol{k}) \, e^{-i \boldsymbol{k} \cdot \boldsymbol{x}} \, \frac{d^3 k}{(2\pi)^3} . \tag{19.5} \]
You may not be familiar with the practice of using the same symbol a in both real and Fourier space; but a is the same vector in Hilbert space, with components a_x = a(x) in real space, and a_k = a(k) in Fourier space. Taking the gradient ∇ in real space is equivalent to multiplying by −ik in Fourier space
\[ \boldsymbol{\nabla} \rightarrow - i \boldsymbol{k} . \tag{19.6} \]
Thus the decomposition (19.1) of the 3D vector w translates into Fourier space as
\[ \boldsymbol{w} = \underbrace{- i \boldsymbol{k} \, w_\parallel}_{\text{scalar}} + \underbrace{\boldsymbol{w}_\perp}_{\text{vector}} , \tag{19.7} \]
where the vector part w⊥ satisfies
\[ \boldsymbol{k} \cdot \boldsymbol{w}_\perp = 0 . \tag{19.8} \]
In other words, in Fourier space the scalar part ∇w∥ of the vector w is the part parallel (longitudinal) to the wavevector k, while the vector part w⊥ is the part perpendicular (transverse) to the wavevector k.
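For a single Fourier mode the decomposition reduces to a projection along and perpendicular to k, which the following sketch carries out. It solves (19.7) for the scalar potential by dotting with k, giving w∥ = i k·w / k²; the function name and the sample numbers are illustrative assumptions.

```python
def split_mode(k, w):
    """Split the Fourier amplitude w of a 3-vector into scalar and vector parts.

    k is a real wavevector, w a complex 3-vector amplitude.
    Returns (w_par, longitudinal, transverse) with
    longitudinal = -i k w_par and k . transverse = 0, eqs. (19.7)-(19.8).
    """
    k2 = sum(ki * ki for ki in k)
    k_dot_w = sum(ki * wi for ki, wi in zip(k, w))
    w_par = 1j * k_dot_w / k2                       # scalar potential
    longitudinal = [-1j * ki * w_par for ki in k]   # part parallel to k
    transverse = [wi - li for wi, li in zip(w, longitudinal)]
    return w_par, longitudinal, transverse

k = (0.0, 0.0, 2.0)                  # wavevector along z
w = (1.0 + 2.0j, 3.0, 4.0 - 1.0j)    # sample mode amplitude
w_par, w_long, w_perp = split_mode(k, w)
print(w_par, w_long, w_perp)
```

With k along z, the transverse part keeps only the x and y components of w, which are the 2 degrees of freedom of the vector part, while the z component is carried entirely by the single scalar w∥.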


19.3 Decomposition of a tensor in flat 3D space

Similarly, the 9 components of a 3 × 3 spatial matrix hij can be decomposed into 3 scalars, 2 vectors, and 1 tensor:
\[ h_{ij} = \underbrace{\delta_{ij} \, \phi}_{\text{scalar}} + \underbrace{\nabla_i \nabla_j h}_{\text{scalar}} + \underbrace{\varepsilon_{ijk} \nabla_k \tilde{h}}_{\text{scalar}} + \underbrace{\nabla_i h_j}_{\text{vector}} + \underbrace{\nabla_j \tilde{h}_i}_{\text{vector}} + \underbrace{h^T_{ij}}_{\text{tensor}} . \tag{19.9} \]
In this context, the term tensor signifies a 3 × 3 matrix h^T_ij that is traceless, symmetric, and transverse:
\[ h^T_{ii} = 0 , \quad h^T_{ij} = h^T_{ji} , \quad \nabla_i h^T_{ij} = 0 . \tag{19.10} \]
The transverse-traceless-symmetric matrix h^T_ij has two degrees of freedom. The vector components hi and h̃i are by definition transverse,
\[ \nabla_i h_i = \nabla_i \tilde{h}_i = 0 . \tag{19.11} \]
The tildes on h̃ and h̃i simply distinguish those symbols; the tildes have no other significance. The trace of the 3 × 3 matrix hij is
\[ h_{ii} = 3 \phi + \nabla^2 h . \tag{19.12} \]

Spinor decomposition.

20 Flat space background

General relativistic perturbation theory is simplest in the case that the unperturbed background space is Minkowski space. In Cartesian coordinates xµ ≡ {t, x, y, z}, the unperturbed coordinate metric is the Minkowski metric
\[ \overset{0}{g}_{\mu\nu} = \eta_{\mu\nu} . \tag{20.1} \]
In this chapter the tetrad is taken to be orthonormal, and aligned with the unperturbed coordinate axes, so that the unperturbed vierbein is the unit matrix
\[ \overset{0}{e}_m{}^\mu = \delta_m^\mu . \tag{20.2} \]
Let overdot denote partial differentiation with respect to time t,
\[ \text{overdot} \equiv \frac{\partial}{\partial t} , \tag{20.3} \]
and let ∇ denote the spatial gradient
\[ \boldsymbol{\nabla} \equiv \frac{\partial}{\partial \boldsymbol{x}} \equiv \nabla_i \equiv \frac{\partial}{\partial x^i} . \tag{20.4} \]
Sometimes it will also be convenient to use ∇m to denote the 4-dimensional spacetime derivative
\[ \nabla_m \equiv \left\{ \frac{\partial}{\partial t} , \boldsymbol{\nabla} \right\} . \tag{20.5} \]

20.1 Classification of vierbein perturbations

The aims of this section are two-fold. First, decompose perturbations into scalar, vector, and tensor parts. Second, identify the coordinate and tetrad gauge-invariant perturbations. It will be found, equations (20.13), that there are 6 coordinate and tetrad gauge-invariant perturbations, comprising 2 scalars Ψ and Φ, 1 vector Wi containing 2 degrees of freedom, and 1 tensor hij containing 2 degrees of freedom.


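The vierbein perturbations ϕmn decompose into 6 scalars, 4 vectors, and 1 tensor
\begin{align}
\varphi_{tt} &= \underbrace{\psi}_{\text{scalar}} , \tag{20.6a} \\
\varphi_{ti} &= \underbrace{\nabla_i w}_{\text{scalar}} + \underbrace{w_i}_{\text{vector}} , \tag{20.6b} \\
\varphi_{it} &= \underbrace{\nabla_i \tilde{w}}_{\text{scalar}} + \underbrace{\tilde{w}_i}_{\text{vector}} , \tag{20.6c} \\
\varphi_{ij} &= \underbrace{\delta_{ij} \Phi}_{\text{scalar}} + \underbrace{\nabla_i \nabla_j h}_{\text{scalar}} + \underbrace{\varepsilon_{ijk} \nabla_k \tilde{h}}_{\text{scalar}} + \underbrace{\nabla_i h_j}_{\text{vector}} + \underbrace{\nabla_j \tilde{h}_i}_{\text{vector}} + \underbrace{h_{ij}}_{\text{tensor}} . \tag{20.6d}
\end{align}
The tildes on w̃ and h̃ simply distinguish those symbols; the tildes have no other significance. The vector components are by definition transverse (have vanishing divergence), while the tensor component hij is by definition traceless, symmetric, and transverse. For a single Fourier mode whose wavevector k is taken without loss of generality to lie in the z-direction, equations (20.6) are
\[ \varphi_{mn} = \begin{pmatrix}
\psi & w_x & w_y & \nabla_z w \\
\tilde{w}_x & \Phi + h_{xx} & h_{xy} + \nabla_z \tilde{h} & \nabla_z \tilde{h}_x \\
\tilde{w}_y & h_{xy} - \nabla_z \tilde{h} & \Phi - h_{xx} & \nabla_z \tilde{h}_y \\
\nabla_z \tilde{w} & \nabla_z h_x & \nabla_z h_y & \Phi + \nabla_z^2 h
\end{pmatrix} . \tag{20.7} \]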

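To identify coordinate gauge-invariant quantities, it is necessary to consider infinitesimal coordinate gauge transformations (18.8). The tetrad-frame components ϵm of the coordinate shift of the coordinate gauge transformation decompose into 2 scalars and 1 vector
\[ \epsilon_m = \{ \underbrace{\epsilon_t}_{\text{scalar}} , \ \underbrace{\nabla_i \epsilon}_{\text{scalar}} + \underbrace{\epsilon_i}_{\text{vector}} \} . \tag{20.8} \]
In the flat background space being considered, the coordinate gauge transformation (18.26) of the vierbein perturbation simplifies to
\[ \varphi_{mn} \rightarrow \varphi'_{mn} = \varphi_{mn} + \nabla_m \epsilon_n . \tag{20.9} \]
In terms of the scalar, vector, and tensor potentials introduced in equations (20.6), the gauge transformations (20.9) are
\begin{align}
\varphi_{tt} &\rightarrow \underbrace{\psi + \dot{\epsilon}_t}_{\text{scalar}} , \tag{20.10a} \\
\varphi_{ti} &\rightarrow \underbrace{\nabla_i (w + \dot{\epsilon})}_{\text{scalar}} + \underbrace{(w_i + \dot{\epsilon}_i)}_{\text{vector}} , \tag{20.10b} \\
\varphi_{it} &\rightarrow \underbrace{\nabla_i (\tilde{w} + \epsilon_t)}_{\text{scalar}} + \underbrace{\tilde{w}_i}_{\text{vector}} , \tag{20.10c} \\
\varphi_{ij} &\rightarrow \underbrace{\delta_{ij} \Phi}_{\text{scalar}} + \underbrace{\nabla_i \nabla_j (h + \epsilon)}_{\text{scalar}} + \underbrace{\varepsilon_{ijk} \nabla_k \tilde{h}}_{\text{scalar}} + \underbrace{\nabla_i (h_j + \epsilon_j)}_{\text{vector}} + \underbrace{\nabla_j \tilde{h}_i}_{\text{vector}} + \underbrace{h_{ij}}_{\text{tensor}} . \tag{20.10d}
\end{align}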


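Equations (20.10) imply that under an infinitesimal coordinate gauge transformation
\begin{align}
& \psi \rightarrow \psi + \dot{\epsilon}_t , \tag{20.11a} \\
& w \rightarrow w + \dot{\epsilon} , \quad w_i \rightarrow w_i + \dot{\epsilon}_i , \tag{20.11b} \\
& \tilde{w} \rightarrow \tilde{w} + \epsilon_t , \quad \tilde{w}_i \rightarrow \tilde{w}_i , \tag{20.11c} \\
& \Phi \rightarrow \Phi , \quad h \rightarrow h + \epsilon , \quad \tilde{h} \rightarrow \tilde{h} , \quad h_i \rightarrow h_i + \epsilon_i , \quad \tilde{h}_i \rightarrow \tilde{h}_i , \quad h_{ij} \rightarrow h_{ij} . \tag{20.11d}
\end{align}
Eliminating the coordinate shift ϵm from the transformations (20.11) yields 12 coordinate gauge-invariant combinations of the potentials
\[ \psi - \dot{\tilde{w}} , \quad w - \dot{h} , \quad w_i - \dot{h}_i , \quad \tilde{w}_i , \quad \Phi , \quad \tilde{h} , \quad \tilde{h}_i , \quad h_{ij} . \tag{20.12} \]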

Physical perturbations are not only coordinate but also tetrad gauge-invariant. A quantity is tetrad gauge-invariant if and only if it depends only on the symmetric part of the vierbein perturbations, not on the antisymmetric part, §18.6. There are 6 combinations of the coordinate gauge-invariant perturbations (20.12) that are symmetric, and therefore not only coordinate but also tetrad gauge-invariant. These 6 coordinate and tetrad gauge-invariant perturbations comprise 2 scalars, 1 vector, and 1 tensor
\begin{align}
\underbrace{\Psi}_{\text{scalar}} &\equiv \psi - \dot{w} - \dot{\tilde{w}} + \ddot{h} , \tag{20.13a} \\
\underbrace{\Phi}_{\text{scalar}} & , \tag{20.13b} \\
\underbrace{W_i}_{\text{vector}} &\equiv w_i + \tilde{w}_i - \dot{h}_i - \dot{\tilde{h}}_i , \tag{20.13c} \\
\underbrace{h_{ij}}_{\text{tensor}} & . \tag{20.13d}
\end{align}

Since only the 6 tetrad and coordinate gauge-invariant potentials Ψ, Φ, Wi , and hij have physical significance, it is legitimate to choose a particular gauge, a set of conditions on the non-gauge-invariant potentials, arranged to simplify the equations, or to bring out some physical aspect. Three gauges considered later are harmonic gauge (§20.7), Newtonian gauge (§20.9), and synchronous gauge (§20.10). However, for the next several sections, no gauge will be chosen: the exposition will continue to be completely general.
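The coordinate gauge invariance of Ψ and Wi can be verified mechanically for a single mode. The sketch below assumes, purely for the illustration, that every potential varies in time as exp(−iωt), so that an overdot becomes multiplication by −iω; it applies the transformation (20.11) with arbitrary shift amplitudes and confirms that the combinations (20.13a) and (20.13c) do not change. Dictionary keys and function names are hypothetical.

```python
def ddt(amplitude, omega):
    """Overdot for a mode varying as exp(-i omega t) (assumed convention)."""
    return -1j * omega * amplitude

def gauge_transform(p, eps_t, eps, eps_x, omega):
    """Apply the coordinate gauge transformation (20.11) to the potentials."""
    q = dict(p)
    q["psi"] = p["psi"] + ddt(eps_t, omega)     # (20.11a)
    q["w"] = p["w"] + ddt(eps, omega)           # (20.11b), scalar
    q["w_x"] = p["w_x"] + ddt(eps_x, omega)     # (20.11b), vector
    q["wt"] = p["wt"] + eps_t                   # (20.11c), w-tilde
    q["h"] = p["h"] + eps                       # (20.11d)
    q["h_x"] = p["h_x"] + eps_x                 # (20.11d)
    return q                                    # wt_x, ht_x are unchanged

def Psi(p, omega):
    """Psi = psi - dw/dt - dwt/dt + d2h/dt2, equation (20.13a)."""
    return (p["psi"] - ddt(p["w"], omega) - ddt(p["wt"], omega)
            + ddt(ddt(p["h"], omega), omega))

def W_x(p, omega):
    """x-component of W_i = w_i + wt_i - dh_i/dt - dht_i/dt, equation (20.13c)."""
    return (p["w_x"] + p["wt_x"]
            - ddt(p["h_x"], omega) - ddt(p["ht_x"], omega))

omega = 1.7
potentials = {"psi": 0.3 + 0.1j, "w": -0.2j, "wt": 0.5, "h": 0.4 - 0.3j,
              "w_x": 0.1, "wt_x": 0.2j, "h_x": -0.6, "ht_x": 0.05 + 0.02j}
shifted = gauge_transform(potentials, eps_t=0.9 - 0.4j, eps=0.25j,
                          eps_x=-0.7 + 0.1j, omega=omega)

print(abs(Psi(shifted, omega) - Psi(potentials, omega)))   # gauge-invariant
print(abs(W_x(shifted, omega) - W_x(potentials, omega)))   # gauge-invariant
print(abs(shifted["psi"] - potentials["psi"]))             # psi alone is not
```

The first two differences vanish up to rounding, whereas ψ on its own changes, which is why a gauge condition may be imposed on the individual potentials without affecting Ψ, Φ, Wi, and hij.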

20.2 Metric, tetrad connections, and Einstein and Weyl tensors

This section gives expressions in a completely general gauge for perturbed quantities in flat background Minkowski space.


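The perturbed coordinate metric gµν , equation (18.5), is
\begin{align}
g_{tt} &= - (1 + 2 \psi) , \tag{20.14a} \\
g_{ti} &= - \nabla_i (w + \tilde{w}) - (w_i + \tilde{w}_i) , \tag{20.14b} \\
g_{ij} &= \delta_{ij} (1 - 2 \Phi) - 2 \nabla_i \nabla_j h - \nabla_i (h_j + \tilde{h}_j) - \nabla_j (h_i + \tilde{h}_i) - 2 h_{ij} . \tag{20.14c}
\end{align}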

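The coordinate metric is tetrad gauge-invariant, but not coordinate gauge-invariant. The perturbed tetrad connections Γkmn are
\begin{align}
\Gamma_{tit} &= - \nabla_i (\psi - \dot{\tilde{w}}) + \dot{\tilde{w}}_i , \tag{20.15a} \\
\Gamma_{tij} &= \delta_{ij} \dot{\Phi} - \nabla_i \nabla_j (w - \dot{h}) - \tfrac{1}{2} (\nabla_i W_j + \nabla_j W_i) + \nabla_j \tilde{w}_i + \dot{h}_{ij} , \tag{20.15b} \\
\Gamma_{ijt} &= \tfrac{1}{2} (\nabla_i W_j - \nabla_j W_i) - \frac{\partial}{\partial t} (\varepsilon_{ijl} \nabla_l \tilde{h} - \nabla_i \tilde{h}_j + \nabla_j \tilde{h}_i) , \tag{20.15c} \\
\Gamma_{ijk} &= (\delta_{jk} \nabla_i - \delta_{ik} \nabla_j) \Phi - \nabla_k (\varepsilon_{ijl} \nabla_l \tilde{h} - \nabla_i \tilde{h}_j + \nabla_j \tilde{h}_i) + \nabla_i h_{jk} - \nabla_j h_{ik} . \tag{20.15d}
\end{align}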

Being purely tetrad-frame quantities, the tetrad connections are automatically coordinate gauge-invariant. However, they are not tetrad gauge-invariant, as is evident from the fact that they (all) depend on antisymmetric parts of the vierbein perturbations ϕmn . One of the advantages of working with tetrads is that tetrad-frame quantities such as the tetrad connections Γkmn and the tetrad-frame Riemann tensor Rklmn are by construction independent of the choice of coordinates, and are therefore automatically coordinate gauge-invariant. In the tetrad formalism, you do not have to work too hard (is that really ever true?) to construct coordinate gauge-invariant combinations of the vierbein perturbations ϕmn : the tetrad-frame connections and Riemann tensor will automatically give you the coordinate gauge-invariant combinations. You can check that in the present case the tetrad connections (20.15) depend only on, and on all 12 of, the coordinate gauge-invariant combinations (20.12).

The perturbed tetrad-frame Einstein tensor Gmn is
\begin{align}
G_{tt} &= \underbrace{2 \nabla^2 \Phi}_{\text{scalar}} , \tag{20.16a} \\
G_{ti} &= \underbrace{2 \nabla_i \dot{\Phi}}_{\text{scalar}} + \underbrace{\tfrac{1}{2} \nabla^2 W_i}_{\text{vector}} , \tag{20.16b} \\
G_{ij} &= \underbrace{2 \delta_{ij} \ddot{\Phi} - (\nabla_i \nabla_j - \delta_{ij} \nabla^2)(\Psi - \Phi)}_{\text{scalar}} + \underbrace{\tfrac{1}{2} (\nabla_i \dot{W}_j + \nabla_j \dot{W}_i)}_{\text{vector}} - \underbrace{\Box h_{ij}}_{\text{tensor}} , \tag{20.16c}
\end{align}
where □ is the d’Alembertian, the 4-dimensional wave operator
\[ \Box \equiv - \nabla_m \nabla^m = \frac{\partial^2}{\partial t^2} - \nabla^2 . \tag{20.17} \]

Being a tetrad-frame quantity, the tetrad-frame Einstein tensor is automatically coordinate gauge-invariant. Equations (20.16) show that the tetrad-frame Einstein tensor $G_{mn}$ is also tetrad gauge-invariant, since it depends only on the tetrad gauge-invariant combinations (20.13) of the vierbein perturbations. The property that the Einstein tensor is tetrad as well as coordinate gauge-invariant is a feature of empty background space, and does not persist to more general spacetimes, such as the Friedmann-Robertson-Walker spacetime.

In a frame with the wavevector $\boldsymbol{k}$ taken along the z-axis, the perturbed Einstein tensor is, with rows and columns ordered $t, x, y, z$,

$$ G_{mn} = \begin{pmatrix}
2 \nabla_z^2 \Phi & \tfrac{1}{2} \nabla_z^2 W_x & \tfrac{1}{2} \nabla_z^2 W_y & 2 \nabla_z \dot{\Phi} \\
\tfrac{1}{2} \nabla_z^2 W_x & 2 \ddot{\Phi} + \nabla_z^2 (\Psi - \Phi) - \Box h_+ & - \Box h_\times & \tfrac{1}{2} \nabla_z \dot{W}_x \\
\tfrac{1}{2} \nabla_z^2 W_y & - \Box h_\times & 2 \ddot{\Phi} + \nabla_z^2 (\Psi - \Phi) + \Box h_+ & \tfrac{1}{2} \nabla_z \dot{W}_y \\
2 \nabla_z \dot{\Phi} & \tfrac{1}{2} \nabla_z \dot{W}_x & \tfrac{1}{2} \nabla_z \dot{W}_y & 2 \ddot{\Phi}
\end{pmatrix} \tag{20.18} $$

where $h_+$ and $h_\times$ are the two polarizations of gravitational waves, discussed further in §20.14,

$$ h_+ \equiv h_{xx} = - h_{yy} , \quad h_\times \equiv h_{xy} = h_{yx} . \tag{20.19} $$

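The tetrad-frame complexified Weyl tensor is

$$ \tilde{C}_{titj} = \underbrace{\tfrac{1}{4} (\nabla_i \nabla_j - \tfrac{1}{3} \delta_{ij} \nabla^2)(\Psi + \Phi)}_{\rm scalar} + \underbrace{\tfrac{1}{8} \left[ - (\nabla_i \dot{W}_j + \nabla_j \dot{W}_i) + i (\varepsilon_{ikl} \nabla_j + \varepsilon_{jkl} \nabla_i) \nabla_k W_l \right]}_{\rm vector} + \underbrace{\tfrac{1}{4} \left[ \ddot{h}_{ij} - \varepsilon_{ikl} \varepsilon_{jmn} \nabla_k \nabla_m h_{ln} - i (\varepsilon_{ikl} \nabla_k \dot{h}_{jl} + \varepsilon_{jkl} \nabla_k \dot{h}_{il}) \right]}_{\rm tensor} . \tag{20.20} $$

Like the tetrad-frame Einstein tensor, the tetrad-frame Weyl tensor is both coordinate and tetrad gauge-invariant, depending only on the coordinate and tetrad gauge-invariant potentials $\Psi$, $\Phi$, $W_i$, and $h_{ij}$.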

20.3 Spinor components of the Einstein tensor

Scalar, vector, and tensor perturbations correspond respectively to perturbations of spin 0, 1, and 2. An object has spin $m$ if it is unchanged by a rotation of $2\pi/m$ about a prescribed direction. In perturbed Minkowski space, the prescribed direction is the direction of the wavevector $\boldsymbol{k}$ in the Fourier decomposition of the modes. The spin components may be projected out by working in a spinor tetrad, §12.1.1. In a frame where the wavevector $\boldsymbol{k}$ is taken along the z-axis, the spinor components of the perturbed Einstein tensor $G_{mn}$ are (compare equations (20.16))

$$ G_{tt} = \underbrace{2 \nabla_z^2 \Phi}_{\text{spin-}0} , \quad G_{tz} = \underbrace{2 \nabla_z \dot{\Phi}}_{\text{spin-}0} , \tag{20.21a} $$

$$ G_{zz} = \underbrace{2 \ddot{\Phi}}_{\text{spin-}0} , \quad G_{+-} - G_{zz} = \underbrace{\nabla_z^2 (\Psi - \Phi)}_{\text{spin-}0} , \tag{20.21b} $$

$$ G_{t\pm} = \underbrace{\tfrac{1}{2} \nabla_z^2 W_\pm}_{\text{spin-}\pm 1} , \quad G_{z\pm} = \underbrace{\tfrac{1}{2} \nabla_z \dot{W}_\pm}_{\text{spin-}\pm 1} , \tag{20.21c} $$

$$ G_{\pm\pm} = \underbrace{- \Box h_{\pm\pm}}_{\text{spin-}\pm 2} , \tag{20.21d} $$

where $W_\pm$ are the spin-$\pm 1$ components of the vector perturbation $W_i$

$$ W_\pm = \tfrac{1}{\sqrt{2}} (W_x \pm i W_y) , \tag{20.22} $$

and $h_{\pm\pm}$ are the spin-$\pm 2$ components of the tensor perturbation $h_{ij}$

$$ h_{\pm\pm} = h_{xx} \pm i h_{xy} = h_+ \pm i h_\times . \tag{20.23} $$

The spin $+2$ and $-2$ components $h_{++}$ and $h_{--}$ of the tensor perturbation are called the right- and left-handed circular polarizations. The spin $+2$ and $-2$ circular polarizations $h_{++}$ and $h_{--}$ have respective shapes $e^{i 2\chi}$ and $e^{-i 2\chi}$, under a right-handed rotation by angle $\chi$ about the z-axis, which may be compared to the $\cos 2\chi$ and $\sin 2\chi$ shapes of the linear polarizations $h_+$ and $h_\times$.
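The spin assignments of equations (20.22) and (20.23) can be checked numerically. The following is a minimal sketch, assuming an active right-handed rotation of the field components about the z-axis (the function names are illustrative, not from the text). Under that assumption $W_\pm$ should pick up a phase $e^{\pm i\chi}$ and $h_{\pm\pm}$ a phase $e^{\pm 2 i\chi}$, so that a tensor mode is unchanged by a rotation of $\pi$.

```python
import cmath
import math

def rotate_vector(W, chi):
    """Actively rotate a transverse vector (W_x, W_y) by angle chi about z."""
    c, s = math.cos(chi), math.sin(chi)
    Wx, Wy = W
    return (c * Wx - s * Wy, s * Wx + c * Wy)

def rotate_tensor(h, chi):
    """Actively rotate a transverse traceless tensor given as (h_plus, h_cross).

    The tensor has h_xx = -h_yy = h_plus and h_xy = h_yx = h_cross;
    the rotated tensor is R h R^T in the x-y plane.
    """
    c, s = math.cos(chi), math.sin(chi)
    hp, hx = h
    m = [[hp, hx], [hx, -hp]]
    R = [[c, -s], [s, c]]
    rot = [[sum(R[i][k] * m[k][l] * R[j][l] for k in range(2) for l in range(2))
            for j in range(2)] for i in range(2)]
    return (rot[0][0], rot[0][1])

def spin1(W, sign):
    """Spin +1 or -1 component of equation (20.22); sign is +1 or -1."""
    return (W[0] + sign * 1j * W[1]) / math.sqrt(2)

def spin2(h, sign):
    """Spin +2 or -2 component of equation (20.23); sign is +1 or -1."""
    return h[0] + sign * 1j * h[1]

chi = 0.3                      # arbitrary test angle
W = (0.7, -0.2)                # arbitrary transverse vector amplitudes
h = (0.5, 1.1)                 # arbitrary (h_plus, h_cross)
print(spin1(rotate_vector(W, chi), +1) / spin1(W, +1), cmath.exp(1j * chi))
print(spin2(rotate_tensor(h, chi), +1) / spin2(h, +1), cmath.exp(2j * chi))
```

With the opposite (passive) rotation convention the phases would be complex-conjugated, but the spin of each component would be the same.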

20.4 Too many Einstein equations?

The Einstein equations are as usual (units $c = G = 1$)

$$ G_{mn} = 8\pi T_{mn} . \tag{20.24} $$

There are 10 Einstein equations, but the Einstein tensor (20.16) depends on only 6 independent potentials: the two scalars $\Psi$ and $\Phi$, the vector $W_i$, and the tensor $h_{ij}$. The system of Einstein equations is thus overcomplete. Why? The answer is that 4 of the Einstein equations enforce conservation of energy-momentum, and can therefore be considered as governing the evolution of the energy-momentum as opposed to being equations for the gravitational potentials. For example, the form of equations (20.16a) and (20.16b) for $G_{tt}$ and $G_{ti}$ enforces conservation of energy

$$ D^m G_{mt} = 0 , \tag{20.25} $$

while the form of equations (20.16b) and (20.16c) for $G_{ti}$ and $G_{ij}$ enforces conservation of momentum

$$ D^m G_{mi} = 0 . \tag{20.26} $$

Normally, the equations governing the evolution of the energy-momentum $T_a^{mn}$ of each species $a$ of mass-energy would be set up so as to ensure overall conservation of energy-momentum. If this is done, then the conservation equations (20.25) and (20.26) can be regarded as redundant. Since equations (20.25) and (20.26) are equations for the time evolution of $G_{tt}$ and $G_{ti}$, one might think that the Einstein equations for $G_{tt}$ and $G_{ti}$ would become redundant, but this is not quite true. In fact the Einstein equations for $G_{tt}$ and $G_{ti}$ impose constraints that must be satisfied on the initial spatial hypersurface. Conservation of energy-momentum guarantees that those constraints will continue to be satisfied on subsequent spatial hypersurfaces, but still the initial conditions must be arranged to satisfy the constraints. Because the Einstein equations for $G_{tt}$ and $G_{ti}$ must be satisfied as constraints on the initial conditions, but thereafter can be ignored, the equations are called constraint equations. The Einstein equation for $G_{tt}$ is called the energy constraint, or Hamiltonian constraint. The Einstein equations for $G_{ti}$ are called the momentum constraints.
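That the form of the Einstein tensor enforces the conservation laws (20.25) and (20.26) identically can be illustrated for a single Fourier mode. The sketch below is an illustration under stated assumptions, not part of the original text: the mode varies as $e^{i(kz - \omega t)}$ with the wavevector along z, so that $\partial_t \to -i\omega$ and $\nabla_z \to ik$; the signature is $-+++$, so that in the flat background $D^m G_{mn} = -\partial_t G_{tn} + \nabla_z G_{zn}$; and the components are those of the matrix (20.18). The function names are hypothetical.

```python
def einstein_tensor_mode(k, omega, Phi, Psi, W, h):
    """Fourier amplitudes G_mn of (20.18) for a mode exp(i(k z - omega t)).

    W = (W_x, W_y) and h = (h_plus, h_cross) are amplitudes of the
    transverse vector and tensor potentials. Index order is t, x, y, z.
    """
    dt, dz = -1j * omega, 1j * k          # d/dt and d/dz acting on the mode
    box = dt * dt - dz * dz               # d'Alembertian (20.17): k^2 - omega^2
    Wx, Wy = W
    hp, hx = h
    diag = 2 * dt * dt * Phi + dz * dz * (Psi - Phi)
    G = [[0j] * 4 for _ in range(4)]
    G[0][0] = 2 * dz * dz * Phi
    G[0][1] = G[1][0] = 0.5 * dz * dz * Wx
    G[0][2] = G[2][0] = 0.5 * dz * dz * Wy
    G[0][3] = G[3][0] = 2 * dz * dt * Phi
    G[1][1] = diag - box * hp
    G[2][2] = diag + box * hp
    G[1][2] = G[2][1] = -box * hx
    G[1][3] = G[3][1] = 0.5 * dz * dt * Wx
    G[2][3] = G[3][2] = 0.5 * dz * dt * Wy
    G[3][3] = 2 * dt * dt * Phi
    return G

def divergence(G, k, omega):
    """D^m G_mn = -d_t G_tn + d_z G_zn for each n (flat background, -+++)."""
    dt, dz = -1j * omega, 1j * k
    return [-dt * G[0][n] + dz * G[3][n] for n in range(4)]

# Arbitrary off-shell amplitudes: the divergence vanishes identically,
# whatever the values of the potentials.
G = einstein_tensor_mode(k=1.3, omega=0.8, Phi=0.2 + 0.1j, Psi=-0.4j,
                         W=(0.3, 0.5j), h=(0.7, -0.2))
print([abs(d) for d in divergence(G, k=1.3, omega=0.8)])
```

The four divergences vanish (to rounding error) for any choice of amplitudes, which is the statement that 4 of the 10 Einstein equations carry no information about the potentials beyond conservation of energy-momentum.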


Exercise 20.1 Energy and momentum constraints. Confirm the argument of this section. Suppose that the spatial Einstein equations are true, $G^{ij} = 8\pi T^{ij}$. Show that if the time-time and time-space Einstein equations $G^{tm} = 8\pi T^{tm}$ are initially true, then conservation of energy-momentum implies that these equations must necessarily remain true at all times. [Hint: Conservation of energy-momentum requires that $D_m T^{mn} = 0$, and the Bianchi identities require that the Einstein tensor satisfies $D_m G^{mn} = 0$, so

$$ D_m (G^{mn} - 8\pi T^{mn}) = 0 . \tag{20.27} $$

By expanding out these equations in full, or otherwise, show that the solution satisfying $G^{ij} - 8\pi T^{ij} = 0$ at all times, and $G^{tm} - 8\pi T^{tm} = 0$ initially, is $G^{tm} - 8\pi T^{tm} = 0$ at all times.]

Concept question 20.2 Which Einstein equations are redundant? It has been argued in this section that, if the energy-momentum tensor $T^{mn}$ is arranged to satisfy energy-conservation $D_m T^{mn} = 0$ as it should, then the time-time and time-space Einstein equations must be satisfied by the initial conditions, but thereafter become redundant. Question: Can any 4 of the 10 Einstein equations be dropped, or just the time-time and time-space Einstein equations?

20.5 Action at a distance?

The tensor component of the Einstein equations shows that, in a vacuum $T_{mn} = 0$, the tensor perturbations $h_{ij}$ propagate at the speed of light, satisfying the wave equation

$$ \Box h_{ij} = 0 . \tag{20.28} $$

The tensor perturbations represent propagating gravitational waves. It is to be expected that scalar and vector perturbations would also propagate at the speed of light, yet this is not obvious from the form of the Einstein tensor (20.16). Specifically, there are 4 components of the Einstein tensor (20.16) that apparently depend only on spatial derivatives, not on time derivatives. The 4 corresponding Einstein equations are

$$ \underbrace{\nabla^2 \Phi}_{\rm scalar} = 4\pi T_{tt} , \tag{20.29a} $$

$$ \underbrace{\nabla^2 W_i}_{\rm vector} = 16\pi T_{ti} , \tag{20.29b} $$

$$ \underbrace{\nabla^2 (\Psi - \Phi)}_{\rm scalar} = - 8\pi Q_{ij} T_{ij} , \tag{20.29c} $$

where Qij in equation (20.29c) is the quadrupole operator defined below, equation (20.87). These conditions must be satisfied everywhere at every instant of time, giving the impression that signals are traveling instantaneously from place to place.


20.6 Comparison to electromagnetism

The previous two sections §§20.4, 20.5 brought up two issues:

1. There are 10 Einstein equations, but only 6 independent gauge-invariant potentials $\Psi$, $\Phi$, $W_i$, and $h_{ij}$. The additional 4 Einstein equations serve to enforce conservation of energy-momentum.
2. Only 2 of the gauge-invariant potentials, the tensor potentials $h_{ij}$, satisfy causal wave equations. The remaining 4 gauge-invariant potentials $\Psi$, $\Phi$, and $W_i$ satisfy equations (20.29) that depend on the instantaneous distribution of energy-momentum throughout space, on the face of it violating causality.

These facts may seem surprising, but in fact the equations of electromagnetism have a similar structure, as will now be shown. In this section, the spacetime is assumed to be flat Minkowski space. The discussion in this section is based in part on the exposition by Bertschinger (1995 “Cosmological Dynamics,” 1993 Les Houches Lectures, arXiv:astro-ph/9503125).

In accordance with the usual procedure, the electromagnetic field may be defined in terms of an electromagnetic 4-potential $A^m$, whose time and spatial parts constitute the scalar potential $\phi$ and the vector potential $\boldsymbol{A}$:

$$ A^m \equiv \{ \phi, \boldsymbol{A} \} . \tag{20.30} $$

The electric and magnetic fields $\boldsymbol{E}$ and $\boldsymbol{B}$ may be defined in terms of the potentials $\phi$ and $\boldsymbol{A}$ by

$$ \boldsymbol{E} \equiv - \nabla \phi - \frac{\partial \boldsymbol{A}}{\partial t} , \tag{20.31a} $$

$$ \boldsymbol{B} \equiv \nabla \times \boldsymbol{A} . \tag{20.31b} $$

Given their definition (20.31), the electric and magnetic fields automatically satisfy the two source-free Maxwell's equations

$$ \nabla \cdot \boldsymbol{B} = 0 , \tag{20.32a} $$

$$ \nabla \times \boldsymbol{E} + \frac{\partial \boldsymbol{B}}{\partial t} = 0 . \tag{20.32b} $$

The remaining two Maxwell's equations, the sourced ones, are

$$ \nabla \cdot \boldsymbol{E} = 4\pi q , \tag{20.33a} $$

$$ \nabla \times \boldsymbol{B} - \frac{\partial \boldsymbol{E}}{\partial t} = 4\pi \boldsymbol{j} , \tag{20.33b} $$

where $q$ and $\boldsymbol{j}$ are the electric charge and current density, the time and space components of the electric 4-current density $j^m$

$$ j^m \equiv \{ q, \boldsymbol{j} \} . \tag{20.34} $$

The electromagnetic potentials $\phi$ and $\boldsymbol{A}$ are not unique, but rather are defined only up to a gauge transformation by some arbitrary gauge field $\chi$

$$ \phi \rightarrow \phi + \frac{\partial \chi}{\partial t} , \quad \boldsymbol{A} \rightarrow \boldsymbol{A} - \nabla \chi . \tag{20.35} $$


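The gauge transformation (20.35) evidently leaves the electric and magnetic fields $\boldsymbol{E}$ and $\boldsymbol{B}$, equations (20.31), invariant. Following the path of previous sections, §20.1 and thereafter, decompose the vector potential $\boldsymbol{A}$ into its scalar and vector parts

$$ \boldsymbol{A} = \underbrace{\nabla A_\parallel}_{\rm scalar} + \underbrace{\boldsymbol{A}_\perp}_{\rm vector} , \tag{20.36} $$

in which the vector part by definition satisfies the transversality condition $\nabla \cdot \boldsymbol{A}_\perp = 0$. Under a gauge transformation (20.35), the potentials transform as

$$ \phi \rightarrow \phi + \frac{\partial \chi}{\partial t} , \tag{20.37a} $$

$$ A_\parallel \rightarrow A_\parallel - \chi , \tag{20.37b} $$

$$ \boldsymbol{A}_\perp \rightarrow \boldsymbol{A}_\perp . \tag{20.37c} $$

Eliminating the gauge field $\chi$ yields 3 gauge-invariant potentials, comprising 1 scalar $\Phi$, and 1 vector $\boldsymbol{A}_\perp$ containing 2 degrees of freedom:

$$ \underbrace{\Phi}_{\rm scalar} \equiv \phi + \frac{\partial A_\parallel}{\partial t} , \tag{20.38a} $$

$$ \underbrace{\boldsymbol{A}_\perp}_{\rm vector} . \tag{20.38b} $$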

This shows that the electromagnetic field contains 3 independent degrees of freedom, consisting of 1 scalar and 1 vector.

Concept question 20.3 Are gauge-invariant potentials Lorentz invariant? The potentials $\Phi$ and $\boldsymbol{A}_\perp$, equations (20.38), are by construction gauge-invariant, but is this construction Lorentz invariant? Do $\Phi$ and $\boldsymbol{A}_\perp$ constitute the components of a 4-vector?

In terms of the gauge-invariant potentials $\Phi$ and $\boldsymbol{A}_\perp$, equations (20.38), the electric and magnetic fields are

$$ \boldsymbol{E} = - \nabla \Phi - \frac{\partial \boldsymbol{A}_\perp}{\partial t} , \tag{20.39a} $$

$$ \boldsymbol{B} = \nabla \times \boldsymbol{A}_\perp . \tag{20.39b} $$
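The gauge invariance of $\Phi$ and $\boldsymbol{A}_\perp$ can be checked concretely for a single Fourier mode. The following is a minimal sketch, assuming all quantities vary as $e^{i(\boldsymbol{k} \cdot \boldsymbol{x} - \omega t)}$, so that $\nabla \to i\boldsymbol{k}$ and $\partial/\partial t \to -i\omega$; in that case $\nabla \cdot \boldsymbol{A} = \nabla^2 A_\parallel$ gives $A_\parallel = -i \boldsymbol{k} \cdot \boldsymbol{A} / k^2$. The function names are hypothetical.

```python
def decompose(phi, A, k, omega):
    """Gauge-invariant potentials (20.38) for one Fourier mode.

    phi is the scalar potential amplitude, A and k are 3-tuples.
    Returns (Phi, A_perp).
    """
    k2 = sum(ki * ki for ki in k)
    k_dot_A = sum(ki * Ai for ki, Ai in zip(k, A))
    A_par = -1j * k_dot_A / k2                     # from div A = del^2 A_par
    A_perp = tuple(Ai - 1j * ki * A_par for ki, Ai in zip(k, A))
    Phi = phi - 1j * omega * A_par                 # Phi = phi + dA_par/dt
    return Phi, A_perp

def gauge_transform(phi, A, k, omega, chi):
    """Gauge transformation (20.35): phi -> phi + dchi/dt, A -> A - grad chi."""
    new_phi = phi - 1j * omega * chi
    new_A = tuple(Ai - 1j * ki * chi for ki, Ai in zip(k, A))
    return new_phi, new_A

k, omega = (0.4, -1.0, 2.0), 0.9                   # arbitrary wavevector, frequency
phi, A = 0.3 + 0.2j, (1.0, 0.5j, -0.7)             # arbitrary potential amplitudes
chi = 1.7 - 0.6j                                   # arbitrary gauge field amplitude

Phi1, Aperp1 = decompose(phi, A, k, omega)
Phi2, Aperp2 = decompose(*gauge_transform(phi, A, k, omega, chi), k, omega)
print(abs(Phi1 - Phi2), [abs(a - b) for a, b in zip(Aperp1, Aperp2)])
print(abs(sum(ki * ai for ki, ai in zip(k, Aperp1))))   # transversality of A_perp
```

The printed differences are zero to rounding error. The division by $k^2$ in `decompose` is the Fourier-space form of the inverse Laplacian, which is why the decomposition is non-local in real space.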

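The sourced Maxwell's equations (20.33) thus become, in terms of $\Phi$ and $\boldsymbol{A}_\perp$,

$$ \underbrace{- \nabla^2 \Phi}_{\rm scalar} = 4\pi \underbrace{q}_{\rm scalar} , \tag{20.40a} $$

$$ \underbrace{\nabla \dot{\Phi}}_{\rm scalar} + \underbrace{\Box \boldsymbol{A}_\perp}_{\rm vector} = 4\pi \underbrace{\nabla j_\parallel}_{\rm scalar} + 4\pi \underbrace{\boldsymbol{j}_\perp}_{\rm vector} , \tag{20.40b} $$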


where $\nabla j_\parallel$ and $\boldsymbol{j}_\perp$ are the scalar and vector parts of the current density $\boldsymbol{j}$. Equations (20.40) bear a striking similarity to the Einstein equations (20.16). Only the vector part $\boldsymbol{A}_\perp$ satisfies a wave equation,

$$ \Box \boldsymbol{A}_\perp = 4\pi \boldsymbol{j}_\perp , \tag{20.41} $$

while the scalar part $\Phi$ satisfies an instantaneous equation (20.40a) that seemingly violates causality. And just as Einstein's equations (20.16) enforce conservation of energy-momentum, so also Maxwell's equations (20.40) enforce conservation of electric charge

$$ \frac{\partial q}{\partial t} + \nabla \cdot \boldsymbol{j} = 0 , \tag{20.42} $$

or in 4-dimensional form

$$ \nabla_m j^m = 0 . \tag{20.43} $$

The fact that only the vector part $\boldsymbol{A}_\perp$ satisfies a wave equation (20.41) reflects physically the fact that electromagnetic waves are transverse, and they contain only two propagating degrees of freedom, the vector, or spin $\pm 1$, components.

Why do Maxwell's equations (20.40) have this structure? Although equation (20.41) appears to be a local wave equation for the vector part $\boldsymbol{A}_\perp$ of the potential sourced by the vector part $\boldsymbol{j}_\perp$ of the current, in fact the wave equation is non-local because the decomposition of the potential and current into scalar and vector parts is non-local (it involves the solution of a Laplacian equation, eq. (19.3)). It is only the sum $\boldsymbol{j} = \nabla j_\parallel + \boldsymbol{j}_\perp$ of the scalar and vector parts of the current density that is local. Therefore, the Maxwell equation (20.40b) must have a scalar part to go along with the vector part, such that the source on the right hand side, the current density $\boldsymbol{j}$, is local. Given this Maxwell equation (20.40b), the Maxwell equation (20.40a) then serves precisely to enforce conservation of electric charge, equation (20.42).

Just as it is possible to regard the Einstein equations (20.16a) and (20.16b) as constraint equations whose continued satisfaction is guaranteed by conservation of energy-momentum, so also the Maxwell equation (20.40a) for $\Phi$ can be regarded as a constraint equation whose continued satisfaction is guaranteed by conservation of electric charge. For charge conservation (20.42) coupled with the spatial Maxwell equation (20.40b) ensures that

$$ \frac{\partial}{\partial t} \left( 4\pi q + \nabla^2 \Phi \right) = 0 , \tag{20.44} $$

the solution of which, subject to the condition that $4\pi q + \nabla^2 \Phi = 0$ initially, is $4\pi q + \nabla^2 \Phi = 0$ at all times, which is precisely the Maxwell equation (20.40a).
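The step leading to equation (20.44) can be spelled out as follows. This is a short sketch that uses only the decomposition (20.36) of the current, the spatial Maxwell equation (20.40b), and charge conservation (20.42), together with the transversality conditions $\nabla \cdot \boldsymbol{A}_\perp = 0$ and $\nabla \cdot \boldsymbol{j}_\perp = 0$:

```latex
\begin{align*}
\nabla \cdot \left( \nabla \dot{\Phi} + \Box \boldsymbol{A}_\perp \right)
  &= 4\pi \, \nabla \cdot \left( \nabla j_\parallel + \boldsymbol{j}_\perp \right)
  && \text{divergence of (20.40b)} \\
\nabla^2 \dot{\Phi}
  &= 4\pi \, \nabla \cdot \boldsymbol{j}
  && \text{since } \nabla \cdot \boldsymbol{A}_\perp = \nabla \cdot \boldsymbol{j}_\perp = 0 \\
  &= - 4\pi \, \frac{\partial q}{\partial t}
  && \text{charge conservation (20.42)} \\
\Longrightarrow \quad
\frac{\partial}{\partial t} \left( 4\pi q + \nabla^2 \Phi \right) &= 0 .
\end{align*}
```

The first line uses the fact that the divergence commutes with the wave operator $\Box$, so the vector term drops out.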

Exercise 20.4 Is it possible to discard the scalar part of the spatial Maxwell equation (20.40b), rather than equation (20.40a) for $\Phi$? Project out the scalar part of equation (20.40b) by taking its divergence,

$$ \nabla^2 \left( 4\pi j_\parallel - \dot{\Phi} \right) = 0 . \tag{20.45} $$

Argue that the Maxwell equation (20.40a), coupled with charge conservation (20.42), ensures that equation (20.45) is true, subject to the boundary condition that the current $\boldsymbol{j}$ vanish sufficiently rapidly at spatial infinity, in accordance with the decomposition theorem of §19.1.

Since only gauge-invariant quantities have physical significance, it is legitimate to impose any condition on the gauge field $\chi$. A gauge in which the potentials $\phi$ and $\boldsymbol{A}$ individually satisfy wave equations is Lorenz (not Lorentz!) gauge, which consists of the Lorentz-invariant condition

$$ \nabla_m A^m = 0 . \tag{20.46} $$

Under a gauge transformation (20.35), the left hand side of equation (20.46) transforms as

$$ \nabla_m A^m \rightarrow \nabla_m A^m + \Box \chi , \tag{20.47} $$

and the Lorenz gauge condition (20.46) can be accomplished as a particular solution of the wave equation for the gauge field $\chi$. In terms of the potentials $\phi$ and $A_\parallel$, the Lorenz gauge condition (20.46) is

$$ \frac{\partial \phi}{\partial t} + \nabla^2 A_\parallel = 0 . \tag{20.48} $$

In Lorenz gauge, Maxwell's equations (20.40) become

$$ \Box \phi = 4\pi q , \tag{20.49a} $$

$$ \Box \boldsymbol{A} = 4\pi \boldsymbol{j} , \tag{20.49b} $$

which are manifestly wave equations for the potentials $\phi$ and $\boldsymbol{A}$.

Does the fact that the potentials $\phi$ and $\boldsymbol{A}$ in one particular gauge, Lorenz gauge, satisfy wave equations necessarily guarantee that the electric and magnetic fields $\boldsymbol{E}$ and $\boldsymbol{B}$ satisfy wave equations? Yes, because it follows from the definitions (20.31) of $\boldsymbol{E}$ and $\boldsymbol{B}$ that if the potentials $\phi$ and $\boldsymbol{A}$ satisfy wave equations, then so also must the fields $\boldsymbol{E}$ and $\boldsymbol{B}$ themselves; but the fields $\boldsymbol{E}$ and $\boldsymbol{B}$ are gauge-invariant, so if they satisfy wave equations in one gauge, then they must satisfy the same wave equations in any gauge.

In electromagnetism, the most physical choice of gauge is one in which the potentials $\phi$ and $\boldsymbol{A}$ coincide with the gauge-invariant potentials $\Phi$ and $\boldsymbol{A}_\perp$, equations (20.38). This gauge, known as Coulomb gauge, is accomplished by setting

$$ A_\parallel = 0 , \tag{20.50} $$

or equivalently

$$ \nabla \cdot \boldsymbol{A} = 0 . \tag{20.51} $$

The gravitational analogue of this gauge is the Newtonian gauge discussed in the next section but one, §20.9. Does the fact that in Lorenz gauge the potentials φ and A propagate at the speed of light (in the absence of sources, j m = 0) imply that the gauge-invariant potentials Φ and A⊥ propagate at the speed of light? No. The gauge-invariant potentials Φ and A⊥ , equations (20.38), are related to the Lorenz gauge potentials φ and A by a non-local decomposition.


20.7 Harmonic gauge

The fact that all locally measurable gravitational perturbations do propagate causally, at the speed of light in the absence of sources, can be demonstrated by choosing a particular gauge, harmonic gauge, equation (20.52), which can be considered an analogue of the Lorenz gauge of electromagnetism, equation (20.46). In harmonic gauge, all 10 of the tetrad gauge-invariant (i.e. symmetric) combinations $\varphi_{mn} + \varphi_{nm}$ of the vierbein perturbations satisfy wave equations (20.56), and therefore propagate causally. This does not imply that the scalar, vector, and tensor components of the vierbein perturbations individually propagate causally, because the decomposition into scalar, vector, and tensor modes is non-local. In particular, the coordinate and tetrad gauge-invariant potentials $\Psi$, $\Phi$, $W_i$, and $h_{ij}$ defined by equations (20.13) do not propagate causally. The situation is entirely analogous to that of electromagnetism, §20.6, where in Lorenz gauge the potentials $\phi$ and $\boldsymbol{A}$ propagate causally, equations (20.49), yet the gauge-invariant potentials $\Phi$ and $\boldsymbol{A}_\perp$ defined by equations (20.38) do not.

Harmonic gauge is the set of 4 coordinate conditions

$$ \nabla^m (\varphi_{mn} + \varphi_{nm}) - \nabla_n \varphi^m{}_m = 0 . \tag{20.52} $$

The conditions (20.52) are arranged in a form that is tetrad gauge-invariant (the conditions depend only on the symmetric part of $\varphi_{mn}$). The quantities on the left hand side of equations (20.52) transform under a coordinate gauge transformation, in accordance with (20.9), as

$$ \nabla^m (\varphi_{mn} + \varphi_{nm}) - \nabla_n \varphi^m{}_m \rightarrow \nabla^m (\varphi_{mn} + \varphi_{nm}) - \nabla_n \varphi^m{}_m + \Box \epsilon_n . \tag{20.53} $$

The change $\Box \epsilon_n$ resulting from the coordinate gauge transformation is the 4-dimensional wave operator $\Box$ acting on the coordinate shift $\epsilon_n$. Indeed, the harmonic gauge conditions (20.52) follow uniquely from the requirements (a) that the change produced by a coordinate gauge transformation be $\Box \epsilon_n$, as suggested by the analogous electromagnetic transformation (20.47), and (b) that the conditions be tetrad gauge-invariant. The harmonic gauge conditions (20.52) can be accomplished as a particular solution of the wave equation for the coordinate shift $\epsilon_n$.

In terms of the potentials defined by equations (20.6) and (20.13), the 4 harmonic gauge conditions (20.52) are

$$ \dot{\Psi} + 3 \dot{\Phi} + \Box (w + \tilde{w} - \dot{h}) = 0 , \tag{20.54a} $$

$$ \dot{W}_i + \Box (h_i + \tilde{h}_i) = 0 , \tag{20.54b} $$

$$ - \Psi + \Phi + \Box h = 0 , \tag{20.54c} $$

or equivalently

$$ - 4 \dot{\Phi} = \Box (w + \tilde{w}) , \tag{20.55a} $$

$$ - \dot{W}_i = \Box (h_i + \tilde{h}_i) , \tag{20.55b} $$

$$ \Psi - \Phi = \Box h . \tag{20.55c} $$

Substituting equations (20.55) into the Einstein tensor $G_{mn}$ leads, after some calculation, to the result that in harmonic gauge

$$ - \tfrac{1}{2} \Box (\varphi_{mn} + \varphi_{nm}) = R_{mn} , \tag{20.56} $$

where $R_{mn} = G_{mn} - \tfrac{1}{2} \eta_{mn} G$ is the Ricci tensor. Equation (20.56) shows that in harmonic gauge, all tetrad gauge-invariant (i.e. symmetric) combinations $\varphi_{mn} + \varphi_{nm}$ of the vierbein potentials propagate causally, at the speed of light in vacuo, $R_{mn} = 0$. Although the result (20.56) is true only in a particular gauge, harmonic gauge, it follows that all quantities that are (coordinate and tetrad) gauge-invariant, and that can be constructed from the vierbein potentials $\varphi_{mn}$ and their derivatives (and are therefore local), must also propagate at the speed of light.

20.8 What is the gravitational field?

In electromagnetism, the electromagnetic fields are the electric field $\boldsymbol{E}$ and the magnetic field $\boldsymbol{B}$. These fields have the property that they are gauge-invariant, and measurable locally. The electromagnetic potentials $\Phi$ and $\boldsymbol{A}_\perp$, equations (20.38), are gauge-invariant, but they are not measurable locally. What are the analogous gauge-invariant and locally measurable quantities for the gravitational field in perturbed Minkowski space? The answer is, the Weyl tensor $C_{klmn}$, the trace-free or tidal part of the Riemann tensor, the expression (20.20) for which depends only on the coordinate and tetrad gauge-invariant potentials.

20.9 Newtonian (Copernican) gauge

If the unperturbed background is Minkowski space, then the most physical gauge is one in which the 6 perturbations retained coincide with the 6 coordinate and tetrad gauge-invariant perturbations (20.13). This gauge is called Newtonian gauge. Because in Newtonian gauge the perturbations are precisely the physical perturbations, if the perturbations are physically weak (small), then the perturbations in Newtonian gauge will necessarily be small.

I think Newtonian gauge should be called Copernican gauge. Even though the solar system is a highly nonlinear system, from the perspective of general relativity it is a weakly perturbed gravitating system. Applied to the solar system, Newtonian gauge effectively keeps the coordinates aligned with the classical Sun-centred Copernican coordinate frame. By contrast, the coordinates of synchronous gauge (§20.10), which are chosen to follow freely-falling bodies, would quickly collapse or get wound up by orbital motions if applied to the solar system, and would cease to provide a useful description.

Newtonian gauge sets

$$ w = \tilde{w} = \tilde{w}_i = h = \tilde{h} = h_i = \tilde{h}_i = 0 , \tag{20.57} $$

so that the retained perturbations are the 6 coordinate and tetrad gauge-invariant perturbations (20.13)

$$ \underbrace{\Psi}_{\rm scalar} = \psi , \tag{20.58a} $$

$$ \underbrace{\Phi}_{\rm scalar} , \tag{20.58b} $$

$$ \underbrace{W_i}_{\rm vector} = w_i , \tag{20.58c} $$

$$ \underbrace{h_{ij}}_{\rm tensor} . \tag{20.58d} $$

The Newtonian line-element is, in a form that keeps the Newtonian tetrad manifest,

$$ ds^2 = - \left[ (1 + \Psi) \, dt \right]^2 + \delta_{ij} \left[ (1 - \Phi) dx^i - h_{ik} dx^k - W^i dt \right] \left[ (1 - \Phi) dx^j - h_{jl} dx^l - W^j dt \right] , \tag{20.59} $$

which reduces to the Newtonian metric

$$ ds^2 = - (1 + 2 \Psi) \, dt^2 - 2 W_i \, dt \, dx^i + \left[ \delta_{ij} (1 - 2 \Phi) - 2 h_{ij} \right] dx^i dx^j . \tag{20.60} $$

Since scalar, vector, and tensor perturbations evolve independently, it is legitimate to consider each in isolation. For example, if one is interested only in scalar perturbations, then it is fine to keep only the scalar potentials $\Psi$ and $\Phi$ non-zero. Furthermore, as discussed in §20.13, since the difference $\Psi - \Phi$ in scalar potentials is sourced by anisotropic relativistic pressure, which is typically small, it is often a good approximation to set $\Psi = \Phi$.

The tetrad-frame 4-velocity of a person at rest in the tetrad frame is by definition $u^m = \{1, 0, 0, 0\}$, and the corresponding coordinate 4-velocity $u^\mu$ is, in Newtonian gauge,

$$ u^\mu = e_t{}^\mu = \{ 1 - \Psi, W_i \} . \tag{20.61} $$

This shows that $W_i$ can be interpreted as a 3-velocity at which the tetrad frame is moving through the coordinates. This is the “dragging of inertial frames” discussed in §20.12.

The proper acceleration experienced by a person at rest in the tetrad frame, with tetrad 4-velocity $u^m = \{1, 0, 0, 0\}$, is

$$ \frac{D u^i}{D \tau} = u^t D_t u^i = u^t \left( \partial_t u^i + \Gamma^i_{tt} u^t \right) = \Gamma^i_{tt} = \nabla_i \Psi . \tag{20.62} $$

This shows that the “gravity,” or minus the proper acceleration, experienced by a person at rest in the tetrad frame is minus the gradient of the potential $\Psi$.

Concept question 20.5 If the decomposition into scalar, vector, and tensor modes is non-local, how can it be legitimate to consider the evolution of the modes in isolation from each other?

20.10 Synchronous gauge

One of the earliest gauges used in general relativistic perturbation theory, and still (in its conformal version) widely used in cosmology, is synchronous gauge. As will be seen below, equations (20.69) and (20.70), synchronous gauge effectively chooses a coordinate system and tetrad that is attached to the locally inertial frames of freely falling observers. This is fine as long as the observers move only slightly from their initial positions, but the coordinate system will fail when the system evolves too far, even if, as in the solar system, the gravitational perturbations remain weak and therefore treatable in principle with perturbation theory.

Synchronous gauge sets the time components $\varphi_{mn}$ with $m = t$ or $n = t$ of the vierbein perturbations to zero

$$ \psi = w = \tilde{w} = w_i = \tilde{w}_i = 0 , \tag{20.63} $$

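and makes the additional tetrad gauge choices

$$ \tilde{h} = \tilde{h}_i = 0 , \tag{20.64} $$

with the result that the retained perturbations are the spatial perturbations

$$ \underbrace{\Phi}_{\rm scalar} , \quad \underbrace{h}_{\rm scalar} , \quad \underbrace{h_i}_{\rm vector} , \quad \underbrace{h_{ij}}_{\rm tensor} . \tag{20.65} $$

In terms of these spatial perturbations, the gauge-invariant perturbations (20.13) are

$$ \underbrace{\Psi}_{\rm scalar} = \ddot{h} , \tag{20.66a} $$

$$ \underbrace{\Phi}_{\rm scalar} , \tag{20.66b} $$

$$ \underbrace{W_i}_{\rm vector} = - \dot{h}_i , \tag{20.66c} $$

$$ \underbrace{h_{ij}}_{\rm tensor} . \tag{20.66d} $$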

The synchronous line-element is, in a form that keeps the synchronous tetrad manifest,

$$ ds^2 = - dt^2 + \delta_{ij} \left[ (1 - \Phi) dx^i - (\nabla_k \nabla_i h + \nabla_k h_i + h_{ik}) dx^k \right] \left[ (1 - \Phi) dx^j - (\nabla_l \nabla_j h + \nabla_l h_j + h_{jl}) dx^l \right] , \tag{20.67} $$

which reduces to the synchronous metric

$$ ds^2 = - dt^2 + \left[ (1 - 2 \Phi) \delta_{ij} - 2 \nabla_i \nabla_j h - \nabla_i h_j - \nabla_j h_i - 2 h_{ij} \right] dx^i dx^j . \tag{20.68} $$

In synchronous gauge, a person at rest in the tetrad frame has coordinate 4-velocity

$$ u^\mu = e_t{}^\mu = \{ 1, 0, 0, 0 \} , \tag{20.69} $$

so that the tetrad rest frame coincides with the coordinate rest frame. Moreover a person at rest in the tetrad frame is freely falling, which follows from the fact that the acceleration experienced by a person at rest in the tetrad frame is zero

$$ \frac{D u^k}{D \tau} = u^t \left( \partial_t u^k + \Gamma^k_{tt} u^t \right) = \Gamma^k_{tt} = 0 , \tag{20.70} $$

in which $\partial_t u^k = 0$ because the 4-velocity at rest in the tetrad frame is constant, $u^k = \{1, 0, 0, 0\}$, and $\Gamma^k_{tt} = 0$ from equations (20.15a) with the synchronous gauge choices (20.63) and (20.64).


20.11 Newtonian potential

The next few sections examine the physical meaning of each of the gauge-invariant potentials $\Psi$, $\Phi$, $W_i$, and $h_{ij}$ by looking at the potentials at large distances produced by a finite body containing energy-momentum, such as the Sun.

Einstein's equations $G_{mn} = 8\pi T_{mn}$ applied to the time-time component $G_{tt}$ of the Einstein tensor, equation (20.16a), imply Poisson's equation

$$ \nabla^2 \Phi = 4\pi \rho , \tag{20.71} $$

where $\rho$ is the mass-energy density

$$ \rho \equiv T_{tt} . \tag{20.72} $$

The solution of Poisson's equation (20.71) is

$$ \Phi(\boldsymbol{x}) = - \int \frac{\rho(\boldsymbol{x}') \, d^3x'}{|\boldsymbol{x}' - \boldsymbol{x}|} . \tag{20.73} $$

Consider a finite body, for example the Sun, whose energy-momentum is confined within a certain region. Define the mass $M$ of the body to be the integral of the mass-energy density $\rho$,

$$ M \equiv \int \rho(\boldsymbol{x}') \, d^3x' . \tag{20.74} $$

Equation (20.74) agrees with what the definition of the mass $M$ would be in the non-relativistic limit, and as seen below, equation (20.77), it is what a distant observer would infer the mass of the body to be based on its gravitational potential $\Phi$ far away. Thus equation (20.74) can be taken as the definition of the mass of the body even when the energy-momentum is relativistic. Choose the origin of the coordinates to be at the centre of mass, meaning that

$$ \int \boldsymbol{x}' \rho(\boldsymbol{x}') \, d^3x' = 0 . \tag{20.75} $$

Consider the potential $\Phi$ at a point $\boldsymbol{x}$ far outside the body. Expand the denominator of the integral on the right hand side of equation (20.73) as a Taylor series in $1/x$ where $x \equiv |\boldsymbol{x}|$

$$ \frac{1}{|\boldsymbol{x}' - \boldsymbol{x}|} = \frac{1}{x} \sum_{\ell=0}^{\infty} \left( \frac{x'}{x} \right)^{\ell} P_\ell(\hat{\boldsymbol{x}} \cdot \hat{\boldsymbol{x}}') = \frac{1}{x} + \frac{\hat{\boldsymbol{x}} \cdot \boldsymbol{x}'}{x^2} + ... \tag{20.76} $$

where $P_\ell(\mu)$ are Legendre polynomials. Then

$$ \Phi(\boldsymbol{x}) = - \frac{1}{x} \int \rho(\boldsymbol{x}') \, d^3x' - \frac{1}{x^2} \, \hat{\boldsymbol{x}} \cdot \int \boldsymbol{x}' \rho(\boldsymbol{x}') \, d^3x' - O(x^{-3}) = - \frac{M}{x} - O(x^{-3}) . \tag{20.77} $$

Equation (20.77) shows that the potential far from a body goes as Φ = −M/x, reproducing the usual Newtonian formula.
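The $O(x^{-3})$ scaling of the correction in equation (20.77) can be seen numerically. The sketch below is an illustration only: it replaces the continuous density $\rho$ by a few point masses (an assumption made for simplicity, with arbitrary masses and positions), shifts the origin to the centre of mass as in (20.75), and compares the exact sum corresponding to (20.73) with $-M/x$ at increasing distance.

```python
import math

def recentre(masses, positions):
    """Shift positions so the centre of mass is at the origin, as in (20.75)."""
    M = sum(masses)
    com = [sum(m * p[i] for m, p in zip(masses, positions)) / M for i in range(3)]
    return [tuple(p[i] - com[i] for i in range(3)) for p in positions]

def potential(x, masses, positions):
    """Discrete analogue of (20.73): Phi(x) = - sum_a m_a / |x_a - x|."""
    return -sum(m / math.dist(x, p) for m, p in zip(masses, positions))

# Hypothetical body made of three point masses (arbitrary values).
masses = [1.0, 2.0, 0.5]
positions = recentre(masses, [(1.0, 0.0, 0.0), (-0.3, 0.4, 0.0), (0.0, -0.5, 0.8)])
M = sum(masses)
direction = (0.6, 0.0, 0.8)          # unit vector towards the observer

for d in (10.0, 20.0, 40.0):
    x = tuple(d * c for c in direction)
    residual = potential(x, masses, positions) + M / d
    # residual * d**3 should tend to a constant (the quadrupole term)
    print(d, residual, residual * d**3)
```

Because the dipole term vanishes in the centre-of-mass frame, the residual falls by roughly a factor of 8 each time the distance doubles.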


20.12 Dragging of inertial frames

In Newtonian gauge, the vector potential $\boldsymbol{W} \equiv W_i$ is the velocity at which the locally inertial tetrad frame moves through the coordinates, equation (20.61). This is called the dragging of inertial frames. As shown below, a body of angular momentum $\boldsymbol{L}$ drags frames around it with an angular velocity that goes to $2\boldsymbol{L}/x^3$ at large distances $x$.

Einstein's equations applied to the vector part of the time-space component $G_{ti}$ of the Einstein tensor, equation (20.16b), imply

$$ \nabla^2 \boldsymbol{W} = - 16\pi \boldsymbol{f} , \tag{20.78} $$

where $\boldsymbol{W} \equiv W_i$ is the gauge-invariant vector potential, and $\boldsymbol{f}$ is the vector part of the energy flux $T^{ti}$

$$ \boldsymbol{f} \equiv f_i = f^i \equiv \underbrace{T^{ti}}_{\rm vector} = - \underbrace{T_{ti}}_{\rm vector} . \tag{20.79} $$

The solution of equation (20.78) is

$$ \boldsymbol{W}(\boldsymbol{x}) = 4 \int \frac{\boldsymbol{f}(\boldsymbol{x}') \, d^3x'}{|\boldsymbol{x}' - \boldsymbol{x}|} . \tag{20.80} $$

As in the previous section, §20.11, consider a finite body, such as the Sun, whose energy-momentum is confined within a certain region. Work in the rest frame of the body, defined to be the frame where the energy flux $\boldsymbol{f}$ integrated over the body is zero,

$$ \int \boldsymbol{f}(\boldsymbol{x}') \, d^3x' = 0 . \tag{20.81} $$

Define the angular momentum $\boldsymbol{L}$ of the body to be

$$ \boldsymbol{L} \equiv \int \boldsymbol{x}' \times \boldsymbol{f}(\boldsymbol{x}') \, d^3x' . \tag{20.82} $$

Equation (20.82) agrees with what the definition of angular momentum would be in the non-relativistic limit, where the mass-energy flux of a mass density $\rho$ moving at velocity $\boldsymbol{v}$ is $\boldsymbol{f} = \rho \boldsymbol{v}$. As will be seen below, the angular momentum (20.82) is what a distant observer would infer the angular momentum of the body to be based on the potential $\boldsymbol{W}$ far away, and equation (20.82) can be taken to be the definition of the angular momentum of the body even when the energy-momentum is relativistic.

As will be proven momentarily, equation (20.83), the integral $\int x'_i f_j(\boldsymbol{x}') \, d^3x'$ is antisymmetric in $ij$. To show this, write $f_j = \varepsilon_{jkl} \nabla_k \phi_l$ for some potential $\phi_l$, which is valid because $f_j$ is the vector (curl) part of the energy flux. Then

$$ \int x'_i f_j(\boldsymbol{x}') \, d^3x' = \int x'_i \, \varepsilon_{jkl} \nabla'_k \phi_l(\boldsymbol{x}') \, d^3x' = - \int \varepsilon_{jkl} \, \phi_l(\boldsymbol{x}') \nabla'_k x'_i \, d^3x' = \int \varepsilon_{ijl} \, \phi_l(\boldsymbol{x}') \, d^3x' , \tag{20.83} $$

where the third expression follows from the second by integration by parts, the surface term vanishing because of the assumption that the energy-momentum of the body is confined within a certain region.


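Taylor expanding equation (20.80) using equation (20.76) gives

$$ \begin{aligned}
\boldsymbol{W}(\boldsymbol{x}) &= \frac{4}{x} \int \boldsymbol{f}(\boldsymbol{x}') \, d^3x' + \frac{4}{x^2} \int (\hat{\boldsymbol{x}} \cdot \boldsymbol{x}') \boldsymbol{f}(\boldsymbol{x}') \, d^3x' + O(x^{-3}) \\
&= \frac{2}{x^2} \int \left[ (\hat{\boldsymbol{x}} \cdot \boldsymbol{x}') \boldsymbol{f}(\boldsymbol{x}') - (\hat{\boldsymbol{x}} \cdot \boldsymbol{f}(\boldsymbol{x}')) \boldsymbol{x}' \right] d^3x' + O(x^{-3}) \\
&= \frac{2}{x^2} \, \boldsymbol{L} \times \hat{\boldsymbol{x}} + O(x^{-3}) ,
\end{aligned} \tag{20.84} $$

where the first integral on the right hand side of the first line of equation (20.84) vanishes because the frame is the rest frame of the body, equation (20.81), and the second integral on the right hand side of the first line equals the first integral on the second line thanks to the antisymmetry of $\int x'_i f_j(\boldsymbol{x}') \, d^3x'$, equation (20.83). The vector potential $\boldsymbol{W} \equiv W_i$ points in the direction of rotation, right-handedly about the axis of angular momentum $\boldsymbol{L}$. Equation (20.84) says that a body of angular momentum $\boldsymbol{L}$ drags frames around it at angular velocity $\boldsymbol{\Omega}$ at large distances $x$

$$ \boldsymbol{W} = \boldsymbol{\Omega} \times \boldsymbol{x} , \quad \boldsymbol{\Omega} = \frac{2 \boldsymbol{L}}{x^3} . \tag{20.85} $$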
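The far-field result (20.84) and (20.85) can be compared against a direct evaluation of (20.80) for a simple source. The sketch below is an illustration under stated assumptions, not the author's calculation: the body is modelled as a ring of equal point masses, evenly spaced and rotating slowly about the z-axis, with non-relativistic flux $\boldsymbol{f} = m \boldsymbol{v}$ for each mass. For such a ring with at least three masses the total flux vanishes and $\sum x_i f_j$ is antisymmetric, which are the two properties that the derivation of (20.84) relies on. All names and numerical values are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ring(n, radius, mass, spin):
    """n point masses on a ring in the x-y plane rotating about z at rate spin.

    Returns positions x_a and energy fluxes f_a = m v_a (non-relativistic).
    """
    pos, flux = [], []
    for a in range(n):
        ang = 2 * math.pi * a / n
        x = (radius * math.cos(ang), radius * math.sin(ang), 0.0)
        v = cross((0.0, 0.0, spin), x)
        pos.append(x)
        flux.append(tuple(mass * vi for vi in v))
    return pos, flux

def angular_momentum(pos, flux):
    """Discrete analogue of (20.82): L = sum_a x_a cross f_a."""
    L = [0.0, 0.0, 0.0]
    for p, f in zip(pos, flux):
        c = cross(p, f)
        for i in range(3):
            L[i] += c[i]
    return tuple(L)

def W_exact(x, pos, flux):
    """Discrete analogue of (20.80): W(x) = 4 sum_a f_a / |x_a - x|."""
    W = [0.0, 0.0, 0.0]
    for p, f in zip(pos, flux):
        d = math.dist(x, p)
        for i in range(3):
            W[i] += 4 * f[i] / d
    return tuple(W)

def W_far(x, L):
    """Leading term (20.85): W = Omega cross x with Omega = 2 L / |x|^3."""
    r = math.sqrt(sum(c * c for c in x))
    Omega = tuple(2 * Li / r**3 for Li in L)
    return cross(Omega, x)

pos, flux = ring(n=8, radius=1.0, mass=1.0, spin=0.1)
L = angular_momentum(pos, flux)
x = (30.0, 40.0, 50.0)               # observer far from the ring
print(W_exact(x, pos, flux))
print(W_far(x, L))
```

The two printed vectors agree closely, with the difference shrinking as the observer moves farther from the ring.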

20.13 Quadrupole pressure

Einstein's equations applied to the part of the Einstein tensor (20.16c) involving $\Psi - \Phi$ imply

$$ \nabla^2 (\Psi - \Phi) = - 8\pi Q_{ij} T_{ij} , \tag{20.86} $$

where $Q_{ij}$ is the quadrupole operator (an integro-differential operator) defined by

$$ Q_{ij} \equiv \tfrac{3}{2} \nabla_i \nabla_j \nabla^{-2} - \tfrac{1}{2} \delta_{ij} , \tag{20.87} $$

with $\nabla^{-2}$ the inverse spatial Laplacian operator. In Fourier space, the quadrupole operator is

$$ Q_{ij} = \tfrac{3}{2} \hat{k}_i \hat{k}_j - \tfrac{1}{2} \delta_{ij} . \tag{20.88} $$

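The quadrupole operator $Q_{ij}$ yields zero when acting on $\delta_{ij}$, and the Laplacian operator $\nabla^2$ when acting on $\nabla_i \nabla_j$,

$$ Q_{ij} \delta_{ij} = 0 , \quad Q_{ij} \nabla_i \nabla_j = \nabla^2 . \tag{20.89} $$

The solution of equation (20.86) is

$$ \Psi - \Phi = - \int \left[ \frac{3}{2} \frac{(x_i - x'_i)(x_j - x'_j)}{|\boldsymbol{x} - \boldsymbol{x}'|^2} - \frac{1}{2} \delta_{ij} \right] \frac{T_{ij}(\boldsymbol{x}') \, d^3x'}{|\boldsymbol{x} - \boldsymbol{x}'|} . \tag{20.90} $$

At large distance in the z-direction from a finite body

$$ \Psi - \Phi = - \frac{1}{x} \int \left[ T_{zz} - \tfrac{1}{2} (T_{xx} + T_{yy}) \right] d^3x' + O(x^{-2}) . \tag{20.91} $$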
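The algebraic properties (20.89) of the quadrupole operator are easy to confirm in Fourier space, where by (20.88) it is just a matrix built from the unit wavevector. The following is a minimal sketch with hypothetical function names. Since $\nabla_i \nabla_j \to -k_i k_j$ and $\nabla^2 \to -k^2$, the identity $Q_{ij} \nabla_i \nabla_j = \nabla^2$ becomes $Q_{ij} k_i k_j = k^2$.

```python
import math

def quadrupole_operator(k):
    """Fourier-space quadrupole operator (20.88) for wavevector k (3-tuple)."""
    norm = math.sqrt(sum(c * c for c in k))
    khat = [c / norm for c in k]
    return [[1.5 * khat[i] * khat[j] - (0.5 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def contract(Q, T):
    """Full contraction Q_ij T_ij of two 3x3 matrices."""
    return sum(Q[i][j] * T[i][j] for i in range(3) for j in range(3))

k = (0.3, -1.2, 2.0)                                  # arbitrary wavevector
Q = quadrupole_operator(k)
delta = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
kk = [[k[i] * k[j] for j in range(3)] for i in range(3)]

print(contract(Q, delta))                             # Q_ij delta_ij
print(contract(Q, kk), sum(c * c for c in k))         # Q_ij k_i k_j versus k^2

# With k along z and a diagonal stress, Q_ij T_ij = T_zz - (T_xx + T_yy)/2,
# the combination appearing in (20.91); an isotropic pressure gives zero.
Qz = quadrupole_operator((0.0, 0.0, 1.0))
T = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 5.0]]
print(contract(Qz, T))
```

The vanishing of $Q_{ij} \delta_{ij}$ is the reason an isotropic pressure does not source $\Psi - \Phi$.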

Equation (20.86) shows that the source of the difference $\Psi - \Phi$ between the two scalar potentials is the quadrupole pressure. Since the quadrupole pressure is small if either there are no relativistic sources, or any relativistic sources are isotropic, it is often a good approximation to set $\Psi = \Phi$. An exception is where there is a significant anisotropic relativistic component. For example, the energy-momentum tensor of a static electric field is relativistic and anisotropic. However, this is still not enough to ensure that $\Psi$ differs from $\Phi$: as found in Exercise 20.6, if the energy-momentum of a body is spherically symmetric, then $\Psi - \Phi$ vanishes outside (but not inside) the body.

One situation where the difference between $\Psi$ and $\Phi$ is appreciable is the case of freely-streaming neutrinos at around the time of recombination in cosmology. The 2008 analysis of the CMB by the WMAP team claims to detect a non-zero value of $\Psi - \Phi$ from a slight shift in the third acoustic peak.

Exercise 20.6 Argue that the traceless part of the energy-momentum tensor of a spherically symmetric distribution must take the form

$$ T_{ij}(r) = \left( \hat{r}_i \hat{r}_j - \tfrac{1}{3} \delta_{ij} \right) \left[ p(r) - p_\perp(r) \right] , \tag{20.92} $$

where $p(r)$ and $p_\perp(r)$ are the radial and transverse pressures at radius $r$. From equation (20.90), show that $\Psi - \Phi$ at radial distance $x$ from the centre of a spherically symmetric distribution is

$$ \Psi(x) - \Phi(x) = - \int_x^\infty (r^2 - x^2) \left[ p(r) - p_\perp(r) \right] \frac{4\pi \, dr}{r} . \tag{20.93} $$

Notice that the integral is over $r > x$, that is, only energy-momentum outside radius $x$ produces non-vanishing $\Psi - \Phi$. In particular, if the body has finite extent, then $\Psi - \Phi$ vanishes outside the body.

20.14 Gravitational waves

The tensor perturbations $h_{ij}$ describe propagating gravitational waves. The two independent components of the tensor perturbations describe two polarizations. The two components are commonly designated $h_+$ and $h_\times$, equations (20.19). Gravitational waves induce a quadrupole tidal oscillation transverse to the direction of propagation, and the subscripts + and × represent the shape of the quadrupole oscillation, as illustrated by Figure 20.1. The $h_+$ polarization has a cos 2χ shape, while the $h_\times$ polarization has a sin 2χ shape, where χ is the azimuthal angle with respect to the y-axis about the direction x of propagation. Einstein's equations applied to the tensor component of the spatial Einstein tensor (20.16c) imply that gravitational waves are sourced by the tensor component of the energy-momentum

$$ \Box \, h_{ij} = - 8\pi \underbrace{T_{ij}}_{\rm tensor} . \tag{20.94} $$

Figure 20.1 The two polarizations of gravitational waves. The (top) polarization $h_+$ has a cos 2χ shape about the direction of propagation (into the paper), while the (bottom) polarization $h_\times$ has a sin 2χ shape. A gravitational wave causes a system of freely falling test masses to oscillate relative to a grid of points a fixed proper distance apart.

The solution of the wave equation (20.94) can be obtained from the Green's function of the d'Alembertian wave operator $\Box$. The Green's function is by definition the solution of the wave equation with a delta-function source. There are retarded solutions, which propagate into the future along the future light cone, and advanced solutions, which propagate into the past along the past light cone. In the present case, the solutions of interest are the retarded solutions, since these represent gravitational waves emitted by a source. Because of the time and space translation symmetry of the d'Alembertian, the delta-function source of the Green's function can without loss of generality be taken at the origin t = x = 0. Thus the Green's function F is the solution of

$$ \Box \, F = \delta^4(x) , \tag{20.95} $$

where $\delta^4(x) \equiv \delta(t) \, \delta^3(\mathbf{x})$ is the 4-dimensional Dirac delta-function. The solution of equation (20.95) subject to retarded boundary conditions is (a standard exercise in mathematics) the retarded Green's function

$$ F = \frac{\delta(x - t) \, \Theta(t)}{4\pi x} , \tag{20.96} $$

where $x \equiv |\mathbf{x}|$ and Θ(t) is the Heaviside function, Θ(t) = 0 for t < 0 and Θ(t) = 1 for t ≥ 0. The solution of the sourced gravitational wave equation (20.94) is thus

$$ h_{ij}(t, \mathbf{x}) = - 2 \int \frac{\overbrace{T_{ij}(t', \mathbf{x}')}^{\rm tensor} \, d^3x'}{|\mathbf{x}' - \mathbf{x}|} , \tag{20.97} $$

where $t'$ is the retarded time

$$ t' \equiv t - |\mathbf{x}' - \mathbf{x}| , \tag{20.98} $$

which lies on the past light cone of the observer, and is the time at which the source emitted the signal. The solution (20.97) resembles the solution of Poisson's equation, except that the source is evaluated along the past light cone of the observer.

As in §§20.11 and 20.12, consider a finite body, whose energy-momentum is confined within a certain region, and which is a source of gravitational waves. The Hulse-Taylor binary pulsar is a fine example. Far from the body, the leading order contribution to the tensor potential $h_{ij}$ is, from the multipole expansion (20.76),

$$ h_{ij}(t, \mathbf{x}) = - \frac{2}{x} \int \underbrace{T_{ij}(t', \mathbf{x}')}_{\rm tensor} \, d^3x' . \tag{20.99} $$

The integral (20.99) is hard to solve in general, but there is a simple solution for gravitational waves whose wavelengths are large compared to the size of the body. To obtain this solution, first consider that conservation of energy-momentum implies that

$$ \frac{\partial^2 T^{tt}}{\partial t^2} - \nabla_i \nabla_j T^{ij} = \frac{\partial}{\partial t} \left( \frac{\partial T^{tt}}{\partial t} + \nabla_i T^{ti} \right) - \nabla_i \left( \frac{\partial T^{ti}}{\partial t} + \nabla_j T^{ji} \right) = 0 . \tag{20.100} $$

Multiply by $x^i x^j$ and integrate

$$ \int x^i x^j \frac{\partial^2 T^{tt}}{\partial t^2} \, d^3x = \int x^i x^j \nabla_k \nabla_l T^{kl} \, d^3x = \int T^{kl} \nabla_k \nabla_l (x^i x^j) \, d^3x = 2 \int T^{ij} \, d^3x , \tag{20.101} $$

where the third expression follows from the second by a double integration by parts. For wavelengths that are long compared to the size of the body, the first expression of equations (20.101) is

$$ \int x_i x_j \frac{\partial^2 T^{tt}}{\partial t^2} \, d^3x \approx \frac{\partial^2}{\partial t^2} \int x_i x_j T^{tt} \, d^3x = \frac{\partial^2 I_{ij}}{\partial t^2} , \tag{20.102} $$

where $I_{ij}$ is the second moment of the mass

$$ I_{ij} \equiv \int x_i x_j T^{tt} \, d^3x . \tag{20.103} $$

The tensor (spin-2) part of the energy-momentum is trace-free. The trace-free part ${\rlap{-}I}_{ij}$ of the second moment $I_{ij}$ is the quadrupole moment of the mass distribution (this definition is conventional, but differs by a factor of 2/3 from what is called the quadrupole moment in spherical harmonics)

$$ {\rlap{-}I}_{ij} \equiv I_{ij} - \tfrac{1}{3} \delta_{ij} I_{kk} = \int \left( x_i x_j - \tfrac{1}{3} \delta_{ij} x^2 \right) T^{tt} \, d^3x . \tag{20.104} $$

Substituting the last expression of equations (20.101) into equation (20.99) gives the quadrupole formula for gravitational radiation at wavelengths long compared to the size of the emitting body

$$ h_{ij}(t, \mathbf{x}) = - \frac{1}{x} \underbrace{\ddot{\rlap{-}I}_{ij}}_{\rm tensor} . \tag{20.105} $$

If the gravitational wave is moving in the z-direction, then the tensor components of the quadrupole moment ${\rlap{-}I}_{ij}$ are

$$ {\rlap{-}I}_+ = \tfrac{1}{2} (I_{xx} - I_{yy}) , \quad {\rlap{-}I}_\times = \tfrac{1}{2} (I_{xy} + I_{yx}) . \tag{20.106} $$
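As an illustration of equations (20.103), (20.104) and (20.106), the sketch below evaluates the second moment of two equal point masses on a circular orbit in the x-y plane, and extracts the two polarization components for a wave travelling along z. The point-mass sum replacing the integral, the function names, and the numerical values are assumptions made for the example; they are not from the text.

```python
import math

def second_moment(masses, positions):
    """Point-mass analogue of (20.103): I_ij = sum_n m_n x_i x_j."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, positions):
        for i in range(3):
            for j in range(3):
                I[i][j] += m * x[i] * x[j]
    return I

def trace_free(I):
    """Trace-free part (20.104): I_ij - (1/3) delta_ij I_kk."""
    tr = sum(I[k][k] for k in range(3))
    return [[I[i][j] - (tr / 3.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

def polarization_components(I):
    """Components (20.106) for propagation along z."""
    I_plus = 0.5 * (I[0][0] - I[1][1])
    I_cross = 0.5 * (I[0][1] + I[1][0])
    return I_plus, I_cross

def binary_positions(radius, phase):
    """Two bodies diametrically opposite on a circle in the x-y plane."""
    x, y = radius * math.cos(phase), radius * math.sin(phase)
    return [(x, y, 0.0), (-x, -y, 0.0)]

m, R, phase = 1.0, 2.0, 0.3  # illustrative values (assumptions)
I = second_moment([m, m], binary_positions(R, phase))
I_plus, I_cross = polarization_components(trace_free(I))
print(I_plus, I_cross)
```

For this configuration the components reduce to $m R^2 \cos 2\varphi$ and $m R^2 \sin 2\varphi$, where φ is the orbital phase, so the quadrupole moment, and hence the radiation (20.105), varies at twice the orbital frequency.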

20.15 Energy-momentum carried by gravitational waves

The gravitational wave equation (20.28) in empty space appears to describe gravitational waves propagating in a region where the energy-momentum tensor $T_{mn}$ is zero. However, gravitational waves do carry energy-momentum, just as do other kinds of waves, such as electromagnetic waves. The energy-momentum is quadratic in the tensor perturbation $h_{ij}$, and so vanishes to linear order. To determine the energy-momentum in gravitational waves, calculate the Einstein tensor $G_{mn}$ to second order, imposing the vacuum conditions that the unperturbed and linear parts of the Einstein tensor vanish

$$ \overset{0}{G}_{mn} = \overset{1}{G}_{mn} = 0 . \tag{20.107} $$

The parts of the second-order perturbation that depend on the tensor perturbation $h_{ij}$ are, in a frame where the wavevector k is along the z-axis,

$$ \overset{2}{G}_{tt} = - (\dot{h}_{ij})(\dot{h}_{ij}) + \frac{1}{4} \left( \frac{\partial^2}{\partial t^2} + \nabla_z^2 \right) h^2 , \tag{20.108a} $$

$$ \overset{2}{G}_{tz} = - (\dot{h}_{ij})(\nabla_z h_{ij}) + \frac{1}{2} \frac{\partial}{\partial t} \nabla_z h^2 , \tag{20.108b} $$

$$ \overset{2}{G}_{zz} = - (\nabla_z h_{ij})(\nabla_z h_{ij}) + \frac{1}{4} \left( \frac{\partial^2}{\partial t^2} + \nabla_z^2 \right) h^2 , \tag{20.108c} $$

where

$$ h^2 \equiv h_{ij} h_{ij} = 2 (h_+^2 + h_\times^2) = 2 h_{++} h_{--} . \tag{20.109} $$

Being tetrad-frame quantities, the expressions (20.108) are automatically coordinate gauge-invariant, and they are also tetrad gauge-invariant since they depend only on the (coordinate and) tetrad gauge-invariant perturbation $h_{ij}$. The rightmost set of terms on the right hand side of each of equations (20.108) are total derivatives (with respect to time t or space z). These terms yield surface terms when integrated over a region, and tend to average to zero when integrated over a region much larger than a wavelength. On the other hand, the leftmost set of terms on the right hand side of each of equations (20.108) do not average to zero; for example, the terms for $G_{tt}$ and $G_{zz}$ are negative everywhere, being minus a sum of squares. A negative energy density? The interpretation is that these terms are to be taken over to the right hand side of the Einstein equations, and re-interpreted as the energy-momentum $T^{\rm gw}_{mn}$ in gravitational waves

$$ T^{\rm gw}_{tt} \equiv \frac{1}{8\pi} \left[ (\dot{h}_{ij})(\dot{h}_{ij}) - \frac{1}{4} \left( \frac{\partial^2}{\partial t^2} + \nabla_z^2 \right) h^2 \right] , \tag{20.110a} $$

$$ T^{\rm gw}_{tz} \equiv \frac{1}{8\pi} \left[ (\dot{h}_{ij})(\nabla_z h_{ij}) - \frac{1}{2} \frac{\partial}{\partial t} \nabla_z h^2 \right] , \tag{20.110b} $$

$$ T^{\rm gw}_{zz} \equiv \frac{1}{8\pi} \left[ (\nabla_z h_{ij})(\nabla_z h_{ij}) - \frac{1}{4} \left( \frac{\partial^2}{\partial t^2} + \nabla_z^2 \right) h^2 \right] . \tag{20.110c} $$

The terms involving total derivatives, although they vanish when averaged over a region larger than many wavelengths, ensure that the energy-momentum $T^{\rm gw}_{mn}$ in gravitational waves satisfies conservation of energy-momentum in the flat background space

$$ \nabla^m T^{\rm gw}_{mn} = 0 . \tag{20.111} $$

Averaged over a region larger than many wavelengths, the energy-momentum in gravitational waves is

$$ \langle T^{\rm gw}_{mn} \rangle = \frac{1}{8\pi} (\nabla_m h_{ij})(\nabla_n h_{ij}) . \tag{20.112} $$

Equation (20.112) may also be written explicitly as a sum over the two linear or circular polarizations

$$ \langle T^{\rm gw}_{mn} \rangle = \frac{1}{4\pi} \left[ (\nabla_m h_+)(\nabla_n h_+) + (\nabla_m h_\times)(\nabla_n h_\times) \right] = \frac{1}{8\pi} \left[ (\nabla_m h_{++})(\nabla_n h_{--}) + (\nabla_n h_{++})(\nabla_m h_{--}) \right] . \tag{20.113} $$
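To make equation (20.113) concrete, the sketch below averages it over one period for a monochromatic, linearly polarized plane wave $h_+ = A \cos[\omega (t - z)]$ with $h_\times = 0$. The waveform, the amplitude and frequency values, and the helper `average_over_period` are assumptions chosen for the example; under these assumptions one expects $\langle T^{\rm gw}_{tt} \rangle = A^2 \omega^2 / (8\pi)$.

```python
import math

def average_over_period(f, omega, samples=2000):
    """Average f(t) over one period 2*pi/omega using uniform sampling."""
    period = 2.0 * math.pi / omega
    return sum(f(n * period / samples) for n in range(samples)) / samples

A, omega, z = 1e-3, 2.0, 0.0  # illustrative amplitude, frequency, position

def hdot_plus(t):
    """Time derivative of h_+ = A cos(omega (t - z))."""
    return -A * omega * math.sin(omega * (t - z))

def dz_h_plus(t):
    """z derivative of h_+ = A cos(omega (t - z))."""
    return A * omega * math.sin(omega * (t - z))

# Equation (20.113) with h_x = 0: <T_mn> = (1/4 pi) (grad_m h_+)(grad_n h_+).
T_tt = average_over_period(lambda t: hdot_plus(t) ** 2, omega) / (4.0 * math.pi)
T_tz = average_over_period(lambda t: hdot_plus(t) * dz_h_plus(t), omega) / (4.0 * math.pi)

expected = A ** 2 * omega ** 2 / (8.0 * math.pi)
print(T_tt, T_tz, expected)
```

The covariant component $T_{tz}$ comes out equal and opposite to $T_{tt}$, as it should for a wave travelling in the +z direction at the speed of light.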

PART EIGHT COSMOLOGICAL PERTURBATIONS

Concept Questions

1. Why do the wavelengths of perturbations in cosmology expand with the Universe, whereas perturbations in Minkowski space do not expand?
2. What does power spectrum mean?
3. Why is the power spectrum a good way to characterize the amplitude of fluctuations?
4. Why is the power spectrum of fluctuations of the Cosmic Microwave Background (CMB) plotted as a function of harmonic number?
5. What causes the acoustic peaks in the power spectrum of fluctuations of the CMB?
6. Are there acoustic peaks in the power spectrum of matter (galaxies) today?
7. What sets the scale of the first peak in the power spectrum of the CMB? [What sets the physical scale? Then what sets the angular scale?]
8. The odd peaks (including the first peak) in the CMB power spectrum are compression peaks, while the even peaks are rarefaction peaks. Why does a rarefaction produce a peak, not a trough?
9. Why is the first peak the most prominent? Why do higher peaks generally get progressively weaker?
10. The third peak is about as strong as the second peak. Why?
11. The matter power spectrum reaches a maximum at a scale that is slightly larger than the scale of the first baryonic acoustic peak. Why?
12. The physical density of species x at the time of recombination is proportional to $\Omega_x h^2$ where $\Omega_x$ is the ratio of the actual to critical density of species x at the present time, and $h \equiv H_0 / (100 \, {\rm km \, s^{-1} \, Mpc^{-1}})$ is the present-day Hubble constant. Explain.
13. How does changing the baryon density $\Omega_b h^2$ affect the CMB power spectrum?
14. How does changing the non-baryonic cold dark matter density $\Omega_c h^2$, without changing the baryon density $\Omega_b h^2$, affect the CMB power spectrum?
15. What effects do neutrinos have on perturbations?
16. How does changing the curvature $\Omega_k$ affect the CMB power spectrum?
17. How does changing the dark energy $\Omega_\Lambda$ affect the CMB power spectrum?

21 An overview of cosmological perturbations

Undoubtedly the preeminent application of general relativistic perturbation theory is to cosmology. Fluctuations in the temperature and polarization of the Cosmic Microwave Background (CMB) provide an observational window on the Universe at 400,000 years old that, coupled with other astronomical observations, has yielded impressively precise measurements of cosmological parameters. The theory of cosmological perturbations is based principally on general relativistic perturbation theory coupled to the physics of 5 species of energy-momentum: photons, baryons, non-baryonic cold dark matter, neutrinos, and dark energy. Dark energy was not important at the time of recombination, where the CMB that we see comes from, but it is important today. If dark energy has a vacuum equation of state, p = −ρ, then dark energy does not cluster (vacuum energy density is a constant), but it affects the evolution of the cosmic scale factor, and thereby does affect the clustering of baryons and dark matter today. Moreover the evolution of the gravitational potential along the line-of-sight to the CMB does affect the observed power spectrum of the CMB, the so-called integrated Sachs-Wolfe effect. Unfortunately, it is beyond the scope of these notes to treat cosmological perturbations in full. For that, consult Scott Dodelson's incomparable text "Modern Cosmology".

1. Inflationary initial conditions. The theory of inflation has been remarkably successful in accounting for many aspects of observational cosmology, even though a fundamental understanding of the inflaton scalar field that supposedly drove inflation is missing. The current paradigm holds that primordial fluctuations were generated by vacuum quantum fluctuations in the inflaton field at the time of inflation.
The theory makes the generic predictions that the gravitational potentials generated by vacuum fluctuations were (a) Gaussian, (b) adiabatic (meaning that all species of mass-energy fluctuated together, as opposed to in opposition to each other), and (c) scale-free, or rather almost scale-free (the fact that inflation came to an end modifies slightly the scale-free character). The three predictions fit the observed power spectrum of the CMB astonishingly well.

2. Comoving Fourier modes. The spatial homogeneity of the Friedmann-Robertson-Walker background spacetime means that its perturbations are characterized by Fourier modes of constant comoving wavevector. Each Fourier mode generated by inflation evolved independently, and its wavelength expanded with the Universe.

3. Scalar, vector, tensor modes. Spatial isotropy on top of spatial homogeneity means that the perturbations comprised independently evolving scalar, vector, and tensor modes. Scalar modes dominate the fluctuations of the CMB, and caused the clustering of matter today. Vector modes are usually assumed to vanish, because there is no mechanism to generate the vorticity that sources vector modes, and the expansion of the Universe tends to redshift away any vector modes that might have been present. Inflation generates gravitational waves, which then propagate essentially freely to the present time. Gravitational waves leave an observational imprint in the "B" (curl) mode of polarization of the CMB, whereas scalar modes produce only an "E" (gradient) mode of polarization.

4. Power spectrum. The primary quantity measurable from observations is the power spectrum, which is the variance of fluctuations of the CMB or of matter (as traced by galaxies, galaxy clusters, the Lyman alpha forest, peculiar velocities, weak lensing, or 21 centimeter observations at high redshift). The statistics of a Gaussian field are completely characterized by its mean and variance. The mean characterizes the unperturbed background, while the variance characterizes the fluctuations. For a 3-dimensional statistically homogeneous and isotropic field, the variance of Fourier modes $\delta_{\mathbf{k}}$ defines the power spectrum $P(k)$

$$ \langle \delta_{\mathbf{k}} \delta_{\mathbf{k}'} \rangle = 1_{\mathbf{k}\mathbf{k}'} P(k) , \tag{21.1} $$

where $1_{\mathbf{k}\mathbf{k}'}$ is the unit matrix in the Hilbert space of Fourier modes

$$ 1_{\mathbf{k}\mathbf{k}'} \equiv (2\pi)^3 \delta_D^3(\mathbf{k} + \mathbf{k}') . \tag{21.2} $$

The "momentum-conserving" Dirac delta-function in equation (21.2) is a consequence of spatial translation symmetry. Isotropy implies that the power spectrum $P(k)$ is a function only of the absolute value $k \equiv |\mathbf{k}|$ of the wavevector. For a statistically rotation invariant field projected on the sky, such as the CMB, the variance of spherical harmonic modes $\Theta_{\ell m} \equiv \delta T_{\ell m} / T$ defines the power spectrum $C_\ell$

$$ \langle \Theta_{\ell m} \Theta_{\ell' m'} \rangle = 1_{\ell m, \ell' m'} C_\ell , \tag{21.3} $$

where $1_{\ell m, \ell' m'}$ is the unit matrix in the Hilbert space of spherical harmonics (distinguish the three usages of δ in this paragraph: δ meaning fluctuation, $\delta_D$ meaning Dirac delta-function, and δ meaning Kronecker delta, as in the following equation)

$$ 1_{\ell m, \ell' m'} \equiv \delta_{\ell \ell'} \delta_{m, -m'} . \tag{21.4} $$

Again, the "angular momentum-preserving" condition (21.4) that ℓ = ℓ′ and m + m′ = 0 is a consequence of rotational symmetry. The same rotational symmetry implies that the power spectrum $C_\ell$ is a function only of the harmonic number ℓ, not of the directional harmonic number m.

5. Reheating. Early Universe inflation evidently came to an end. It is presumed that the vacuum energy released by the decay of the inflaton field, an event called reheating, somehow efficiently produced the matter and radiation fields that we see today. After reheating, the Universe was dominated by relativistic fields, collectively called "radiation". Reheating changed the evolution of the cosmic scale factor from acceleration to deceleration, but is presumed not to have generated additional fluctuations.

6. Photon-baryon fluid and the sound horizon. Photon-electron (Thomson) scattering kept photons and baryons tightly coupled to each other, so that they behaved like a relativistic fluid. As long as the radiation density exceeded the baryon density, which remained true up to near the time of recombination, the speed of sound in the photon-baryon fluid was $\sqrt{p/\rho} \approx \sqrt{1/3}$ of the speed of light. Fluctuations with wavelengths outside the sound horizon grew by gravity. As time went by, the sound horizon expanded in comoving radius, and fluctuations thereby came inside the sound horizon. Once inside the sound horizon, sound waves could propagate, which tended to decrease the gravitational potential. However, each individual sound wave itself continued to oscillate, its oscillation amplitude δT/T relative to the background temperature T remaining approximately constant. The relativistic suppression of the potential at small scales is responsible for the fact that the power spectrum of matter declines at small scales.

7. Acoustic peaks in the power spectrum. The oscillations of the photon-baryon fluid produced the characteristic pattern of peaks and troughs in the CMB power spectrum observed today. The same peaks and troughs occur in the matter power spectrum, but are much less prominent, at a level of about 10% as opposed to the order unity oscillations observed in the CMB power spectrum. For adiabatic fluctuations, the amplitude of the temperature fluctuations follows a pattern $\sim - \cos(k \eta_s)$ where $\eta_s$ is the comoving sound horizon. The n'th peak occurs at a wavenumber k where $k \eta_s \approx n\pi$. In the observed CMB power spectrum, the relevant value of the sound horizon $\eta_s$ is its value $\eta_{s,*}$ at recombination.
Thus the wavenumber k of the first peak of the observed CMB power spectrum occurs where $k \eta_{s,*} \approx \pi$. Two competing forces cause a mode to evolve: a gravitational force that amplifies compression, and a restoring pressure force that counteracts compression. When a mode enters the sound horizon for the first time, the compressing gravitational force beats the restoring pressure force, so the first thing that happens is that the mode compresses further. Consequently the first peak is a compression peak. This sets the subsequent pattern: odd peaks are compression peaks, while even peaks are rarefaction peaks. The observed temperature fluctuations of the CMB are produced by a combination of intrinsic temperature fluctuations, Doppler shifts, and gravitational redshifting out of potential wells. The Doppler shift produced by the velocity of a perturbation is 90° out of phase with the temperature fluctuation, and so tends to fill in the troughs in the power spectrum of the temperature fluctuation. This is the main reason that the observed CMB power spectrum remains above zero at all scales.

8. Logarithmic growth of matter fluctuations. Non-baryonic cold dark matter interacts weakly except by gravity, and is needed to explain the observed clustering of matter in the Universe today in spite of the small amplitude of temperature fluctuations in the CMB. The adjective "cold" refers to the requirement that the dark matter became non-relativistic (p = 0) at some early time. If the dark matter is both non-baryonic and cold, then it did not participate in the oscillations of the photon-baryon fluid. During the radiation-dominated phase prior to matter-radiation equality, dark matter fluctuations inside the sound horizon grew logarithmically. The logarithmic growth translates into a logarithmic increase in the amplitude of matter fluctuations at small scales, and is a characteristic signature of non-baryonic cold dark matter. Unfortunately this signature is not readily discernible in the power spectrum of matter today, because of nonlinear clustering.

9. Epoch of matter-radiation equality. The density of non-relativistic matter decreases more slowly than the density of relativistic radiation. There came a point where the matter density equaled the radiation density, an epoch called matter-radiation equality, after which the matter density exceeded the radiation density. The observed ratio of the density of matter and radiation (CMB) today requires that matter-radiation equality occurred at a redshift of $z_{\rm eq} \approx 3200$, a factor of 3 higher in redshift than recombination at $z_* \approx 1100$. After matter-radiation equality, dark matter perturbations grew more rapidly, linearly instead of just logarithmically with cosmic scale factor. A larger dark matter density causes matter-radiation equality to occur earlier. The sound horizon at matter-radiation equality corresponds to a scale roughly around the 2.5'th peak in the CMB power spectrum. For adiabatic fluctuations, the way that the temperature and gravitational perturbations interact when a mode first enters the sound horizon means that the temperature oscillation is 5 times larger for modes that enter the horizon well into the radiation-dominated epoch versus well into the matter-dominated epoch. The effect enhances the amplitude of observed CMB peaks higher than 2.5 relative to those lower than 2.5. The observed relative strengths of the 3rd versus the 2nd peak of the CMB power spectrum provide a measurement of the redshift of matter-radiation equality, and direct evidence for the presence of non-baryonic cold dark matter.

10. Sound speed. The density of baryons decreased more slowly than the density of radiation, so that at around recombination the baryon density was becoming comparable to the radiation density. The sound speed $\sqrt{p/\rho}$ depends on the ratio of pressure p, which was essentially entirely that of the photons, to the density ρ, which was produced by both photons and baryons. The sound speed consequently decreased below $\sqrt{1/3}$. Increasing the baryon-to-photon ratio at recombination has several observational effects on the acoustic peaks of the CMB power spectrum, making it a prime measurable parameter from the CMB. First, an increased baryon fraction increases the gravitational forcing (baryon loading), which enhances the compression (odd) peaks while reducing the rarefaction (even) peaks. Second, increasing the baryon fraction reduces the sound speed, which: (a) decreases the amplitude of the radiation dipole relative to the radiation monopole, so increasing the prominence of the peaks; and (b) reduces the oscillation frequency of the photon-baryon fluid, which shifts the peaks to larger scales. The reduced sound speed also causes an adiabatic reduction of the amplitudes of all modes by the square root of the sound speed, but this effect is degenerate with an overall reduction in the initial amplitudes of modes produced by inflation.

11. Recombination. As the temperature cooled below about 3,000 K, electrons combined with hydrogen and helium nuclei into neutral atoms. This drastically reduced the amount of photon-electron scattering, releasing the CMB to propagate almost freely. At the same time, the baryons were released from the photons. Without radiation pressure to support them, fluctuations in the baryons began to grow like the dark matter fluctuations.

12. Neutrinos. Probably all three species of neutrino have mass less than 0.3 eV and were therefore relativistic up to and at the time of recombination. Each of the 3 species of neutrino had an abundance comparable to that of photons, and therefore made an important contribution to the relativistic background and its fluctuations. Unlike photons, neutrinos streamed freely, without scattering. The relativistic free-streaming of neutrinos provided the main source of the quadrupole pressure that produces a non-vanishing difference Ψ − Φ between the scalar potentials. However, the neutrino quadrupole pressure was still only ∼ 10% of the neutrino monopole pressure. To the extent that the neutrino quadrupole pressure can be approximated as negligible, the neutrinos and their fluctuations can be treated the same as photons.

13. CMB fluctuations. The CMB fluctuations seen on the sky today represent a projection of fluctuations on a thin but finite shell at a redshift of about 1100, corresponding to an age of the Universe of about 400,000 yr. The temperature, and the degrees of polarization in two different directions, provide 3 independent observables at each point on the sky. The isotropy of the unperturbed radiation means that it is most natural to measure the fluctuations in spherical harmonics, which are the eigenmodes of the rotation operator. Similarly, it is natural to measure the CMB polarization in spin harmonics.

14. Matter fluctuations. After recombination, perturbations in the non-baryonic and baryonic matter grew by gravity, essentially unaffected any longer by photon pressure. If one or more of the neutrino types had a mass small enough to be relativistic but large enough to contribute appreciable density, then its relativistic streaming could have suppressed power in matter fluctuations at small scales, but observations show no evidence of such suppression, which places an upper limit of about an eV on the mass of the most massive neutrino. The matter power spectrum measured from the clustering of galaxies contains acoustic oscillations like the CMB power spectrum, but because the non-baryonic dark matter dominates the baryons, the oscillations are much smaller.

15. Integrated Sachs-Wolfe effect.
Variations in the gravitational potential along the line-of-sight to the CMB affect the CMB power spectrum at large scales. This is called the integrated Sachs-Wolfe (ISW) effect. If matter dominates the background, then the gravitational potential Φ has the property that it remains constant in time for (subhorizon) linear fluctuations, and there is no ISW effect. In practice, ISW effects are produced by at least three distinct causes. First, an early-time ISW effect is produced by the fact that the Universe at recombination still has an appreciable component of radiation, and is not yet wholly matter-dominated. Second, a late-time ISW effect is produced either by curvature or by a cosmological constant. Third, a non-linear ISW effect is produced by non-linear evolution of the potential.
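The redshift of matter-radiation equality quoted in item 9 follows from the different dilution rates of the two components: matter density scales as $(1+z)^3$ and radiation density as $(1+z)^4$, so equality occurs where $1 + z_{\rm eq} = \Omega_m / \Omega_r$. The sketch below evaluates this ratio. The density parameters used are round illustrative numbers assumed for the example (they are not given in the text), with the radiation density taken to include photons plus three effectively massless neutrino species.

```python
def redshift_of_equality(omega_m_h2, omega_r_h2):
    """Redshift at which matter and radiation densities are equal.

    Matter scales as (1+z)^3 and radiation as (1+z)^4, so
    1 + z_eq = Omega_m / Omega_r (the factors of h^2 cancel).
    """
    return omega_m_h2 / omega_r_h2 - 1.0

# Illustrative present-day physical densities (assumptions, not from the text).
OMEGA_M_H2 = 0.14      # all non-relativistic matter
OMEGA_R_H2 = 4.15e-5   # photons plus three massless neutrino species

z_eq = redshift_of_equality(OMEGA_M_H2, OMEGA_R_H2)
print(round(z_eq))
```

With these inputs the result lands within several percent of the value $z_{\rm eq} \approx 3200$ quoted above; raising the matter density moves equality to higher redshift, as item 9 states.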

22 ∗Cosmological perturbations in a flat Friedmann-Robertson-Walker background

For simplicity, this book considers only a flat (not closed or open) Friedmann-Robertson-Walker background. The comoving cosmological horizon size at recombination was much smaller than today, and consequently the cosmological density Ω was much closer to 1 at recombination than it is today. Since observations indicate that the Universe today is within 1% of being spatially flat, it is an excellent approximation to treat the Universe at the time of recombination as being spatially flat. With some modifications arising from cosmological expansion, perturbation theory on a flat FRW background is quite similar to perturbation theory in flat (Minkowski) space, Chapter 20. The strategy is to start in a completely general gauge, and to discover how the conformal Newtonian gauge, which is used in subsequent Chapters, emerges naturally as that gauge in which the perturbations are precisely the physical perturbations.

22.1 Unperturbed line-element

It is convenient to choose the coordinate system $x^\mu = \{\eta, x^i\}$ to consist of conformal time η together with Cartesian comoving coordinates $\mathbf{x} \equiv x^i \equiv \{x, y, z\}$. The coordinate metric of the unperturbed background FRW geometry is then

$$ ds^2 = a(\eta)^2 \left( - d\eta^2 + dx^2 + dy^2 + dz^2 \right) , \tag{22.1} $$

where a(η) is the cosmic scale factor. The unperturbed coordinate metric is thus the conformal Minkowski metric

$$ \overset{0}{g}_{\mu\nu} = a(\eta)^2 \eta_{\mu\nu} . \tag{22.2} $$

The tetrad is taken to be orthonormal, with the unperturbed tetrad axes $\boldsymbol{\gamma}_m \equiv \{\boldsymbol{\gamma}_0, \boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2, \boldsymbol{\gamma}_3\}$ being aligned with the unperturbed coordinate axes $\boldsymbol{g}_\mu \equiv \{\boldsymbol{g}_\eta, \boldsymbol{g}_x, \boldsymbol{g}_y, \boldsymbol{g}_z\}$ so that the unperturbed vierbein and inverse vierbein are respectively 1/a and a times the unit matrix

$$ \overset{0}{e}_m{}^\mu = \frac{1}{a} \, \delta_m^\mu , \quad \overset{0}{e}{}^m{}_\mu = a \, \delta^m_\mu . \tag{22.3} $$

Let overdot denote partial differentiation with respect to conformal time η,

$$ \text{overdot} \equiv \frac{\partial}{\partial \eta} , \tag{22.4} $$

so that for example $\dot{a} \equiv da/d\eta$. The coordinate time derivative ∂/∂η is to be distinguished from the directed time derivative $\partial_0 \equiv e_0{}^\mu \, \partial/\partial x^\mu$. Let $\nabla_i$ denote the comoving gradient

$$ \nabla_i \equiv \frac{\partial}{\partial x^i} , \tag{22.5} $$

which should be distinguished from the directed derivative $\partial_i \equiv e_i{}^\mu \, \partial/\partial x^\mu$.

22.2 Comoving Fourier modes

Since the unperturbed Friedmann-Robertson-Walker spacetime is spatially homogeneous and isotropic, it is natural to work in comoving Fourier modes. Comoving Fourier modes have the key property that they evolve independently of each other, as long as perturbations remain linear. Equations in Fourier space are obtained by replacing the comoving spatial gradient $\nabla_i$ by −i times the comoving wavevector $k_i$ (the choice of sign is the standard convention in cosmology)

$$ \nabla_i \to - i k_i . \tag{22.6} $$

By this means, the spatial derivatives become algebraic, so that the partial differential equations governing the evolution of perturbations become ordinary differential equations. In what follows, the comoving gradient ∇i will be used interchangeably with −iki , whichever is most convenient.
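The replacement (22.6) can be checked on a single mode. The sketch below assumes a one-dimensional mode varying as $e^{-ikx}$, which is the spatial dependence implied by the sign in (22.6), and compares a finite-difference gradient with the algebraic factor $-ik$. The function names and sample values are illustrative assumptions.

```python
import cmath

def mode(k, x):
    """A single comoving Fourier mode, assumed to vary as exp(-i k x)."""
    return cmath.exp(-1j * k * x)

def gradient(f, x, step=1e-6):
    """Central finite-difference approximation to df/dx."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

k, x = 3.0, 0.7  # illustrative wavenumber and position

numerical = gradient(lambda y: mode(k, y), x)  # comoving gradient acting on the mode
algebraic = -1j * k * mode(k, x)               # the replacement grad -> -i k

print(abs(numerical - algebraic))
```

The difference is at the level of the finite-difference error, which illustrates how spatial derivatives turn into multiplications and partial differential equations into ordinary ones.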

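22.3 Classification of vierbein perturbations

The tetrad-frame components $\varphi_{mn}$ of the vierbein perturbation of the FRW geometry decompose in much the same way as in the flat Minkowski case into 6 scalars, 4 vectors (8 degrees of freedom), and 1 tensor (2 degrees of freedom) (the following equations are essentially the same as those (20.6) for the flat Minkowski background),

$$ \varphi_{00} = \underbrace{\psi}_{\rm scalar} , \tag{22.7a} $$

$$ \varphi_{0i} = \underbrace{\nabla_i w}_{\rm scalar} + \underbrace{w_i}_{\rm vector} , \tag{22.7b} $$

$$ \varphi_{i0} = \underbrace{\nabla_i \tilde{w}}_{\rm scalar} + \underbrace{\tilde{w}_i}_{\rm vector} , \tag{22.7c} $$

$$ \varphi_{ij} = \underbrace{\delta_{ij} \phi}_{\rm scalar} + \underbrace{\nabla_i \nabla_j h}_{\rm scalar} + \underbrace{\varepsilon_{ijk} \nabla_k \tilde{h}}_{\rm scalar} + \underbrace{\nabla_i h_j}_{\rm vector} + \underbrace{\nabla_j \tilde{h}_i}_{\rm vector} + \underbrace{h_{ij}}_{\rm tensor} . \tag{22.7d} $$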

The tetrad-frame components $\epsilon_m$ of the coordinate shift of the coordinate gauge transformation (18.8) similarly decompose into 2 scalars and 1 vector (the following equation is essentially the same as that (20.8) for the flat Minkowski background)

$$ \epsilon_m = \{ \underbrace{\epsilon_0}_{\rm scalar} , \ \underbrace{\nabla_i \epsilon}_{\rm scalar} + \underbrace{\epsilon_i}_{\rm vector} \} . \tag{22.8} $$

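The vierbein perturbations $\varphi_{mn}$ transform under a coordinate gauge transformation (18.8) as, equation (18.25),

$$ \varphi_{00} \to \underbrace{\psi + \frac{1}{a} \frac{\partial \epsilon_0}{\partial \eta}}_{\rm scalar} , \tag{22.9a} $$

$$ \varphi_{0i} \to \underbrace{\nabla_i \left[ w + \frac{1}{a} \left( \frac{\partial}{\partial \eta} - \frac{\dot{a}}{a} \right) \epsilon \right]}_{\rm scalar} + \underbrace{w_i + \frac{1}{a} \left( \frac{\partial}{\partial \eta} - \frac{\dot{a}}{a} \right) \epsilon_i}_{\rm vector} , \tag{22.9b} $$

$$ \varphi_{i0} \to \underbrace{\nabla_i \left( \tilde{w} + \frac{1}{a} \epsilon_0 \right)}_{\rm scalar} + \underbrace{\tilde{w}_i}_{\rm vector} , \tag{22.9c} $$

$$ \varphi_{ij} \to \underbrace{\delta_{ij} \left( \phi - \frac{\dot{a}}{a^2} \epsilon_0 \right) + \nabla_i \nabla_j \left( h + \frac{1}{a} \epsilon \right) + \varepsilon_{ijk} \nabla_k \tilde{h}}_{\rm scalar} + \underbrace{\nabla_i \left( h_j + \frac{1}{a} \epsilon_j \right) + \nabla_j \tilde{h}_i}_{\rm vector} + \underbrace{h_{ij}}_{\rm tensor} , \tag{22.9d} $$

or equivalently

$$ \psi \to \psi + \frac{1}{a} \frac{\partial \epsilon_0}{\partial \eta} , \tag{22.10a} $$

$$ w \to w + \frac{1}{a} \left( \frac{\partial}{\partial \eta} - \frac{\dot{a}}{a} \right) \epsilon , \quad w_i \to w_i + \frac{1}{a} \left( \frac{\partial}{\partial \eta} - \frac{\dot{a}}{a} \right) \epsilon_i , \tag{22.10b} $$

$$ \tilde{w} \to \tilde{w} + \frac{1}{a} \epsilon_0 , \quad \tilde{w}_i \to \tilde{w}_i , \tag{22.10c} $$

$$ \phi \to \phi - \frac{\dot{a}}{a^2} \epsilon_0 , \quad h \to h + \frac{1}{a} \epsilon , \quad \tilde{h} \to \tilde{h} , \quad h_i \to h_i + \frac{1}{a} \epsilon_i , \quad \tilde{h}_i \to \tilde{h}_i , \quad h_{ij} \to h_{ij} . \tag{22.10d} $$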

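Eliminating the coordinate shift $\epsilon_m$ from the transformations (22.10) yields 12 coordinate gauge-invariant combinations of the perturbations

$$ \psi - \left( \frac{\partial}{\partial \eta} + \frac{\dot{a}}{a} \right) \tilde{w} , \quad w - \dot{h} , \quad w_i - \dot{h}_i , \quad \tilde{w}_i , \quad \phi + \frac{\dot{a}}{a} \tilde{w} , \quad \tilde{h} , \quad \tilde{h}_i , \quad h_{ij} . \tag{22.11} $$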
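The invariance of the combinations (22.11) can be spot-checked numerically. The sketch below applies the transformations (22.10a), (22.10c) and (22.10d) with an arbitrary time-dependent shift $\epsilon_0(\eta)$ and confirms that $\psi - (\partial/\partial\eta + \dot{a}/a) \tilde{w}$ and $\phi + (\dot{a}/a) \tilde{w}$ are unchanged. The choice $a = \eta^2$, the test functions, and the finite-difference derivative are all assumptions made purely for illustration.

```python
import math

def ddeta(f, eta, step=1e-5):
    """Central finite-difference derivative with respect to conformal time."""
    return (f(eta + step) - f(eta - step)) / (2.0 * step)

# Illustrative background and perturbations (all assumptions).
def a(eta): return eta ** 2
def adot_over_a(eta): return ddeta(a, eta) / a(eta)
def psi(eta): return 0.3 * math.sin(eta)
def phi(eta): return 0.2 * math.cos(2.0 * eta)
def w_tilde(eta): return 0.1 * eta * math.exp(-eta)
def eps0(eta): return 0.05 * math.sin(3.0 * eta)  # gauge shift epsilon_0

# Gauge-transformed perturbations, equations (22.10a), (22.10c), (22.10d).
def psi_new(eta): return psi(eta) + ddeta(eps0, eta) / a(eta)
def w_tilde_new(eta): return w_tilde(eta) + eps0(eta) / a(eta)
def phi_new(eta): return phi(eta) - adot_over_a(eta) * eps0(eta) / a(eta)

def invariant_psi(psi_f, wt_f, eta):
    """psi - (d/d eta + adot/a) w_tilde, the first combination in (22.11)."""
    return psi_f(eta) - ddeta(wt_f, eta) - adot_over_a(eta) * wt_f(eta)

def invariant_phi(phi_f, wt_f, eta):
    """phi + (adot/a) w_tilde, the fifth combination in (22.11)."""
    return phi_f(eta) + adot_over_a(eta) * wt_f(eta)

eta = 1.7
change_psi = invariant_psi(psi_new, w_tilde_new, eta) - invariant_psi(psi, w_tilde, eta)
change_phi = invariant_phi(phi_new, w_tilde_new, eta) - invariant_phi(phi, w_tilde, eta)
print(change_psi, change_phi)
```

Both changes vanish up to finite-difference error, whereas ψ and φ individually shift by amounts of order the size of $\epsilon_0$.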

Six combinations of these coordinate gauge-invariant perturbations depend only on the symmetric part $\varphi_{mn} + \varphi_{nm}$ of the vierbein perturbations, and are therefore tetrad gauge-invariant as well as coordinate gauge-invariant. These 6 coordinate and tetrad gauge-invariant perturbations comprise 2 scalars, 1 vector, and 1 tensor

$$ \underbrace{\Psi}_{\rm scalar} \equiv \psi - \left( \frac{\partial}{\partial \eta} + \frac{\dot{a}}{a} \right) (w + \tilde{w} - \dot{h}) , \tag{22.12a} $$

$$ \underbrace{\Phi}_{\rm scalar} \equiv \phi + \frac{\dot{a}}{a} (w + \tilde{w} - \dot{h}) , \tag{22.12b} $$

$$ \underbrace{W_i}_{\rm vector} \equiv w_i + \tilde{w}_i - \dot{h}_i - \dot{\tilde{h}}_i , \tag{22.12c} $$

$$ \underbrace{h_{ij}}_{\rm tensor} . \tag{22.12d} $$

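22.4 Metric, tetrad connections, and Einstein tensor

This section gives expressions in a completely general gauge for perturbed quantities in the flat Friedmann-Robertson-Walker background geometry. The perturbed coordinate metric $g_{\mu\nu}$ is

$$ g_{\eta\eta} = - a^2 (1 + 2 \psi) , \tag{22.13a} $$

$$ g_{\eta i} = - a^2 \left[ \nabla_i (w + \tilde{w}) + (w_i + \tilde{w}_i) \right] , \tag{22.13b} $$

$$ g_{ij} = a^2 \left[ (1 - 2 \phi) \delta_{ij} - 2 \nabla_i \nabla_j h - \nabla_i (h_j + \tilde{h}_j) - \nabla_j (h_i + \tilde{h}_i) - 2 h_{ij} \right] . \tag{22.13c} $$

The coordinate metric is tetrad gauge-invariant, but not coordinate gauge-invariant. The perturbed tetrad connections $\Gamma_{kmn}$ are

$$ \Gamma_{0i0} = - \frac{1}{a} \left\{ \nabla_i \left[ \psi - \left( \frac{\partial}{\partial \eta} + \frac{\dot{a}}{a} \right) \tilde{w} \right] + \left( \frac{\partial}{\partial \eta} + \frac{\dot{a}}{a} \right) \tilde{w}_i \right\} , \tag{22.14a} $$

$$ \Gamma_{0ij} = \frac{1}{a} \left[ \left( - \frac{\dot{a}}{a} + F \right) \delta_{ij} - \nabla_i \nabla_j (w - \dot{h}) - \frac{1}{2} (\nabla_i W_j + \nabla_j W_i) + \nabla_j \tilde{w}_i + \dot{h}_{ij} \right] , \tag{22.14b} $$

$$ \Gamma_{ij0} = \frac{1}{a} \left[ \frac{1}{2} (\nabla_i W_j - \nabla_j W_i) - \frac{\partial}{\partial \eta} (\varepsilon_{ijl} \nabla_l \tilde{h} - \nabla_i \tilde{h}_j + \nabla_j \tilde{h}_i) \right] , \tag{22.14c} $$

$$ \Gamma_{ijk} = \frac{1}{a} \left[ (\delta_{jk} \nabla_i - \delta_{ik} \nabla_j) \left( \phi + \frac{\dot{a}}{a} \tilde{w} \right) - \frac{\dot{a}}{a} (\delta_{ik} \delta_{jl} - \delta_{jk} \delta_{il}) \tilde{w}_l - \nabla_k (\varepsilon_{ijl} \nabla_l \tilde{h} - \nabla_i \tilde{h}_j + \nabla_j \tilde{h}_i) + \nabla_i h_{jk} - \nabla_j h_{ik} \right] , \tag{22.14d} $$

where F is defined by

$$ F \equiv \frac{\dot{a}}{a} \psi + \dot{\phi} . \tag{22.15} $$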

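Being purely tetrad-frame quantities, the tetrad connections are automatically coordinate gauge-invariant, but they are not tetrad gauge-invariant. The quantity F defined by equation (22.15) is not coordinate gauge-invariant, but the combination $-\dot a/a^2 + F/a$ that appears in the expression (22.14b) for $\Gamma_{0ij}$ is coordinate and tetrad gauge-invariant.

Exercise 22.1 Coordinate gauge invariance of $-\dot a/a^2 + F/a$. Argue that under a coordinate gauge transformation of the conformal time, $\eta \to \eta' = \eta + \epsilon^\eta$, the cosmic scale factor $a(\eta)$ and its derivative $\dot a \equiv da/d\eta$ transform as (see §18.8)

\[ a \to a + \mathcal{L}_\epsilon a = a - \dot a\,\epsilon^\eta = a + \frac{\dot a}{a}\epsilon_0 , \quad \dot a \to \dot a + \mathcal{L}_\epsilon \dot a = \dot a - \ddot a\,\epsilon^\eta = \dot a + \frac{\ddot a}{a}\epsilon_0 . \tag{22.16} \]

Check that this behaviour is consistent with the gauge transformation (18.28) of $g_{\eta\eta}$, equation (22.13a). Hence show that $-\dot a/a^2 + F/a$ is coordinate gauge invariant. THIS IS NOT WORKING.

The perturbed tetrad-frame Einstein tensor $G_{mn}$ is

\[ G_{00} = \frac{1}{a^2}\underbrace{\left( 3\frac{\dot a^2}{a^2} - 6\frac{\dot a}{a}F + 2\nabla^2\Phi \right)}_{\rm scalar} , \tag{22.17a} \]
\[ G_{0i} = \frac{1}{a^2}\left\{ 2\nabla_i \underbrace{\left[ F + \left( \frac{\ddot a}{a} - 2\frac{\dot a^2}{a^2} \right)\tilde w \right]}_{\rm scalar} + \underbrace{\frac{1}{2}\nabla^2 W_i + 2\left( \frac{\ddot a}{a} - 2\frac{\dot a^2}{a^2} \right)\tilde w_i}_{\rm vector} \right\} , \tag{22.17b} \]
\[ \begin{aligned} G_{ij} = \frac{1}{a^2}\Bigg\{ & \underbrace{\left[ -2\frac{\ddot a}{a} + \frac{\dot a^2}{a^2} + 2\left(\frac{\partial}{\partial\eta} + 2\frac{\dot a}{a}\right)F + 2\left( \frac{\ddot a}{a} - 2\frac{\dot a^2}{a^2} \right)\psi \right]\delta_{ij}}_{\rm scalar} - \underbrace{(\nabla_i\nabla_j - \delta_{ij}\nabla^2)(\Psi - \Phi)}_{\rm scalar} \\ & + \underbrace{\frac{1}{2}\left(\frac{\partial}{\partial\eta} + 2\frac{\dot a}{a}\right)(\nabla_i W_j + \nabla_j W_i)}_{\rm vector} - \underbrace{\left( \frac{\partial^2}{\partial\eta^2} + 2\frac{\dot a}{a}\frac{\partial}{\partial\eta} - \nabla^2 \right) h_{ij}}_{\rm tensor} \Bigg\} . \end{aligned} \tag{22.17c} \]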

Being tetrad-frame quantities, all components of the tetrad-frame Einstein tensor are automatically coordinate gauge-invariant. The time-time $G_{00}$ and space-space $G_{ij}$ components are not only coordinate but also tetrad gauge-invariant, as follows from the fact that these components depend only on symmetric combinations of the vierbein potentials. Specifically, the quantities $3(\dot a^2/a^4) - 6(\dot a/a^3)F$ on the right hand side of equation (22.17a) for $G_{00}$, and the coefficient of $\delta_{ij}$ (including the overall $1/a^2$ factor) on the right hand side of equation (22.17c) for $G_{ij}$, are coordinate and tetrad gauge-invariant. However, the time-space components $G_{0i}$ are not tetrad gauge-invariant, as is evident from the fact that equation (22.17b) involves the non-tetrad-gauge-invariant perturbations $\tilde w$ and $\tilde w_i$. Physically, under a tetrad boost by a velocity v of linear order, the time-space components $G_{0i}$ change by first order v, but $G_{00}$ and $G_{ij}$ change only to second order $v^2$. Thus to linear order, only $G_{0i}$ changes under a tetrad boost. Note that $G_{0i}$ changes under a tetrad boost ($\tilde w$ and $\tilde w_i$), but not under a tetrad rotation ($\tilde h$ and $\tilde h_i$).


22.5 ADM gauge choices

The ADM (3+1) formalism, Chapter 13, chooses the tetrad time axis $\gamma_0$ to be orthogonal to hypersurfaces of constant time, η = constant, equivalent to requiring that the tetrad time axis be orthogonal to each of the spatial coordinate axes, $\gamma_0 \cdot g_i = 0$, equation (13.2). The ADM choice is equivalent to setting

\[ \tilde w = \tilde w_i = 0 . \tag{22.18} \]

The ADM choice simplifies the tetrad-frame connections (22.14) and the time-space component $G_{0i}$ of the tetrad-frame Einstein tensor, equation (22.17b). Another gauge choice that significantly simplifies the tetrad connections (22.14), though it does not affect the Einstein tensor (22.17), is

\[ \tilde h = \tilde h_i = 0 . \tag{22.19} \]

If the wavevector k is taken along the coordinate z-direction, then the gauge choice $\tilde h_i = 0$ is equivalent to choosing the tetrad 3-axis (z-axis) $\gamma_3$ to be orthogonal to the coordinate x and y-axes, $\gamma_3 \cdot g_x = \gamma_3 \cdot g_y = 0$. The gauge choice $\tilde h = 0$ is equivalent to rotating the tetrad axes about the 3-axis (z-axis) so that $\gamma_1 \cdot g_y = \gamma_2 \cdot g_x$.

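22.6 Conformal Newtonian gauge

Conformal Newtonian gauge sets

\[ w = \tilde w = \tilde w_i = h = \tilde h = h_i = \tilde h_i = 0 , \tag{22.20} \]

so that the retained perturbations are the 6 coordinate and tetrad gauge-invariant perturbations (22.12):

\[ \underbrace{\Psi}_{\rm scalar} = \psi , \tag{22.21a} \]
\[ \underbrace{\Phi}_{\rm scalar} = \phi , \tag{22.21b} \]
\[ \underbrace{W_i}_{\rm vector} = w_i , \tag{22.21c} \]
\[ \underbrace{h_{ij}}_{\rm tensor} . \tag{22.21d} \]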

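In conformal Newtonian gauge, the scalar perturbations of the Einstein equations are the energy density, energy flux, monopole pressure, and quadrupole pressure equations

\[ -3\frac{\dot a}{a}F - k^2\Phi = 4\pi G a^2\, \overset{1}{T}{}^{00} , \tag{22.22a} \]
\[ ikF = 4\pi G a^2\, \hat k_i\, \overset{1}{T}{}^{0i} , \tag{22.22b} \]
\[ \dot F + 2\frac{\dot a}{a}F + \left( \frac{\ddot a}{a} - 2\frac{\dot a^2}{a^2} \right)\Psi - \frac{k^2}{3}(\Psi - \Phi) = \frac{4}{3}\pi G a^2\, \delta_{ij}\, \overset{1}{T}{}^{ij} , \tag{22.22c} \]
\[ k^2(\Psi - \Phi) = 8\pi G a^2 \left( \tfrac{3}{2}\hat k_i \hat k_j - \tfrac{1}{2}\delta_{ij} \right) \overset{1}{T}{}^{ij} , \tag{22.22d} \]

where F is the coordinate and tetrad gauge-invariant quantity

\[ F \equiv \frac{\dot a}{a}\Psi + \dot\Phi . \tag{22.23} \]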

All 4 of the scalar Einstein equations (22.22) are expressed in terms of gauge-invariant variables, and are therefore fully gauge-invariant. The energy and momentum equations (22.22a) and (22.22b) can be combined to eliminate F, yielding an equation for Φ alone:

\[ -k^2\Phi = 4\pi G a^2 \left( \overset{1}{T}{}^{00} + 3\frac{\dot a}{a}\frac{\hat k_i}{ik}\,\overset{1}{T}{}^{0i} \right) . \tag{22.24} \]

The quantity in parentheses on the right hand side of equation (22.24) is the source for the scalar potential Φ, and can be interpreted as a measure of the “true” energy fluctuation. This is however just a matter of interpretation: the perturbations $\overset{1}{T}{}^{00}$ and $\overset{1}{T}{}^{0i}$ are individually gauge-invariant, and therefore each has physical meaning.

If the energy-momentum tensors of the various matter components are arranged so as to conserve overall energy-momentum, as they should, then 2 of the 4 equations (22.22a)–(22.22d) are redundant, since they serve simply to enforce conservation of energy and scalar momentum. Thus it suffices to take, as the equations governing the potentials Φ and Ψ, any 2 of the equations (22.22a)–(22.22d). One is free to retain whichever 2 of the equations are convenient. Usually the 1st equation, the energy equation (22.22a), and the 4th equation, the quadrupole pressure equation (22.22d), are most convenient to retain. But sometimes the 2nd equation, the scalar momentum equation (22.22b), is more convenient in place of the energy equation (22.22a).
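The elimination of F that leads to equation (22.24) can be written out in two lines. This is a sketch that uses only equations (22.22a) and (22.22b) as given above.

```latex
% Solve the momentum equation (22.22b) for F:
F = 4\pi G a^2\, \frac{\hat k_i}{ik}\, \overset{1}{T}{}^{0i} ,
% and substitute into the energy equation (22.22a):
-k^2\Phi = 4\pi G a^2\, \overset{1}{T}{}^{00} + 3\frac{\dot a}{a}F
         = 4\pi G a^2 \left( \overset{1}{T}{}^{00}
           + 3\frac{\dot a}{a}\frac{\hat k_i}{ik}\,\overset{1}{T}{}^{0i} \right) .
```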

22.7 Synchronous gauge

One gauge that remains in common use in cosmology, but is not used here, is synchronous gauge, discussed in the case of Minkowski background space in §20.10. The cosmological synchronous gauge choices are the same as for the Minkowski background, equations (20.63) and (20.64):

\[ \psi = w = \tilde w = w_i = \tilde w_i = \tilde h = \tilde h_i = 0 . \tag{22.25} \]

The gauge-invariant perturbations (22.12) in synchronous gauge are

\[ \underbrace{\Psi}_{\rm scalar} = \ddot h + \frac{\dot a}{a}\dot h , \tag{22.26a} \]
\[ \underbrace{\Phi}_{\rm scalar} = \phi - \frac{\dot a}{a}\dot h , \tag{22.26b} \]
\[ \underbrace{W_i}_{\rm vector} = -\dot h_i , \tag{22.26c} \]
\[ \underbrace{h_{ij}}_{\rm tensor} . \tag{22.26d} \]

23 Cosmological perturbations: a simplest set of assumptions

1. Consider only scalar modes.
2. Consider explicitly only two species: non-baryonic cold dark matter, and radiation consisting of photons and neutrinos lumped together. Neglect the contribution of baryons to the mass density.
3. Treat the radiation as almost isotropic, so it is dominated by its first two moments, the monopole and dipole. In practice, electron-photon scattering keeps photons almost isotropic. Unlike photons, neutrinos stream freely, but they inherit an approximately isotropic distribution from an early time when they were in thermodynamic equilibrium.
4. Include damping from photon-electron (Thomson) scattering by allowing the radiation a small quadrupole moment, the diffusion approximation.

23.1 Perturbed FRW line-element

The perturbed FRW line-element in conformal Newtonian gauge is

\[ ds^2 = a^2 \left[ -(1 + 2\Psi)\,d\eta^2 + \delta_{ij}(1 - 2\Phi)\,dx^i dx^j \right] , \tag{23.1} \]

where a(η) is the cosmic scale factor, a function only of conformal time η.

23.2 Energy-momenta of ideal fluids

In the simplest approximation, matter, radiation, and dark energy can each be treated as ideal fluids. The energy-momentum tensor of an ideal fluid with proper density ρ and isotropic pressure p in its own rest frame, and moving with bulk 4-velocity $u^m$ relative to the conformal Newtonian tetrad frame, is

\[ T^{mn} = (\rho + p)\,u^m u^n + p\,\eta^{mn} . \tag{23.2} \]

In the situation under consideration, the fluids satisfy equations of state

\[ p = w\rho \tag{23.3} \]

with w constant. Specifically, w = 0 for non-relativistic matter, w = 1/3 for relativistic radiation, and w = −1 for dark energy with constant density. Furthermore, all the fluids are moving with non-relativistic bulk velocities, including the radiation, which is almost isotropic, and therefore has a small bulk velocity even though the individual particles of radiation move at the speed of light. The bulk tetrad-frame 4-velocity $u^m$ is thus, to linear order,

\[ u^m = \{ 1, v_i \} , \tag{23.4} \]

where $v_i$ is the non-relativistic spatial bulk 3-velocity (the spatial tetrad metric is Euclidean, so $v^i = v_i$). The density ρ can be written in terms of the unperturbed density $\bar\rho$ and a fluctuation δ defined by

\[ \rho = \bar\rho\,[1 + (1 + w)\delta] . \tag{23.5} \]

The factor 1 + w is included in the definition of the fluctuation δ because it simplifies the resulting perturbation equations (23.12). As you will discover in Exercise 23.1, the fluctuation δ can be interpreted physically as the entropy fluctuation,

\[ \delta = \frac{1}{1 + w}\frac{\delta\rho}{\bar\rho} = \frac{\delta s}{\bar s} . \tag{23.6} \]

For matter, w = 0, the entropy fluctuation coincides with the density fluctuation, $\delta = \delta\rho/\bar\rho$. For dark energy, w = −1, the density fluctuation is necessarily zero, $\delta\rho/\bar\rho = 0$. To linear order in the velocity $v_i$, the tetrad-frame energy-momentum tensor (23.2) of the ideal fluid is then

\[ T^{00} \equiv \bar\rho\,[1 + (1 + w)\delta] , \tag{23.7a} \]
\[ T^{0i} \equiv (1 + w)\bar\rho\, v_i , \tag{23.7b} \]
\[ T^{ij} = w\bar\rho\,[1 + (1 + w)\delta]\,\delta_{ij} . \tag{23.7c} \]

To linear order in the fluctuation δ, velocity $v_i$, and potentials Ψ and Φ, and in conformal Newtonian gauge, conservation of energy and momentum requires

\[ D_m T^{m0} = \frac{1 - \Psi}{a}\left[ \frac{\partial\rho}{\partial\eta} + (1 + w)\nabla_i(\rho v_i) + 3(1 + w)\rho\left( \frac{\dot a}{a} - \dot\Phi \right) \right] = 0 , \tag{23.8a} \]
\[ D_m T^{mi} = \frac{1}{a}\left[ (1 + w)\frac{\partial \rho v_i}{\partial\eta} + 4(1 + w)\frac{\dot a}{a}\rho v_i + w\nabla_i\rho + (1 + w)\rho\nabla_i\Psi \right] = 0 . \tag{23.8b} \]

The energy conservation equation (23.8a) has an unperturbed part,

\[ \overset{0}{D_m T^{m0}} = \frac{1}{a}\left[ \frac{\partial\bar\rho}{\partial\eta} + 3(1 + w)\bar\rho\frac{\dot a}{a} \right] = 0 , \tag{23.9} \]

which implies the usual result that the mean density evolves as a power law with cosmic scale factor, $\bar\rho \propto a^{-3(1+w)}$. Subtracting appropriate amounts of the unperturbed energy conservation equation (23.9) from the perturbed energy-momentum conservation equations (23.8) yields equations for the fluctuation δ and velocity $v_i$:

\[ \dot\delta + \nabla_i v_i = 3\dot\Phi , \tag{23.10a} \]
\[ \dot v_i + (1 - 3w)\frac{\dot a}{a}v_i + w\nabla_i\delta = -\nabla_i\Psi . \tag{23.10b} \]

Now decompose the 3-velocity $v_i$ into its scalar v and vector $v_{\perp,i}$ parts. Up to this point, the scalar part of a vector has been taken to be the gradient of a potential. But here it is advantageous to absorb a factor of k into the definition of the scalar part v of the velocity, so that instead of $v_i = -ik_i v + v_{\perp,i}$ in Fourier space, the velocity is given in Fourier space by

\[ v_i = -i\hat k_i v + v_{\perp,i} . \tag{23.11} \]

The advantage of this choice is that v is dimensionless, as are δ and Ψ and Φ. The scalar parts of the perturbation equations (23.10) are then

\[ \dot\delta - kv = 3\dot\Phi , \tag{23.12a} \]
\[ \dot v + (1 - 3w)\frac{\dot a}{a}v + wk\delta = -k\Psi . \tag{23.12b} \]

Combining the two equations (23.12) for the fluctuation δ and velocity v yields a second-order differential equation for δ − 3Φ,

\[ \left[ \frac{d^2}{d\eta^2} + (1 - 3w)\frac{\dot a}{a}\frac{d}{d\eta} + k^2 w \right](\delta - 3\Phi) = -k^2(\Psi + 3w\Phi) . \tag{23.13} \]

For positive w, equation (23.13) is a wave equation for a damped, forced oscillator with sound speed $\sqrt{w}$. The resulting generic behaviour for the particular cases of matter (w = 0) and radiation (w = 1/3) is considered in §23.6 and §23.7 below.

A more careful treatment, deferred to Chapter 24, accounts for the complete momentum distribution of radiation by expanding the temperature perturbation $\Theta \equiv \delta T/\bar T$ in multipole moments, equation (24.48). The radiation fluctuation $\delta_r$ and scalar bulk velocity $v_r$ are related to the first two multipole moments of the temperature perturbation, the monopole $\Theta_0$ and the dipole $\Theta_1$, by

\[ \delta_r = 3\Theta_0 , \tag{23.14a} \]
\[ v_r = 3\Theta_1 . \tag{23.14b} \]

The factor of 3 arises because the unperturbed radiation distribution is in thermodynamic equilibrium, for which the entropy density is $s \propto T^3$, so $\delta_r = 3\,\delta T/\bar T$.

The energy-momentum perturbations $\overset{1}{T}{}^{mn}$ that go into the Einstein equations (22.22) are, from equations (23.7) with the unperturbed part subtracted,

\[ \overset{1}{T}{}^{00} \equiv (1 + w)\bar\rho\,\delta , \tag{23.15a} \]
\[ \overset{1}{T}{}^{0i} \equiv (1 + w)\bar\rho\, v_i , \tag{23.15b} \]
\[ \overset{1}{T}{}^{ij} = w(1 + w)\bar\rho\,\delta\,\delta_{ij} . \tag{23.15c} \]


Exercise 23.1 Entropy fluctuation. The purpose of this exercise is to discover that the fluctuation δ defined by equation (23.5) can be interpreted as the entropy fluctuation. According to the first law of thermodynamics, the entropy density s of a fluid of energy density ρ, pressure p, and temperature T in a volume V satisfies

\[ d(\rho V) + p\,dV = T\,d(sV) . \tag{23.16} \]

If the fluid is ideal, so that ρ, p, T, and s are independent of volume V, then integrating the first law (23.16) implies that

\[ \rho V + pV = TsV . \tag{23.17} \]

This implies that the entropy density s is related to the other variables by

\[ s = \frac{\rho + p}{T} . \tag{23.18} \]

Show that, for an ideal fluid with equation of state p/ρ = w = constant, the first law (23.16) together with the expression (23.18) for entropy implies that

\[ T \propto \rho^{w/(1+w)} , \tag{23.19} \]

and hence

\[ s \propto \rho^{1/(1+w)} . \tag{23.20} \]

Conclude that small variations of the entropy and density are related by

\[ \frac{\delta s}{s} = \frac{1}{1 + w}\frac{\delta\rho}{\rho} , \tag{23.21} \]

confirming equation (23.6). [Hint: Do not confuse what is being asked here with adiabatic expansion. The results (23.19) and (23.20) are properties of the fluid, independent of whether the fluid is changing adiabatically. For adiabatic expansion, the fluid satisfies the additional condition sV = constant.]
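One possible route through Exercise 23.1, sketched under the assumption that the first law is applied at fixed volume V (so the dV term drops out) and that p = wρ with constant w:

```latex
% First law at fixed V, with s = (1+w) rho / T from (23.18):
d\rho = T\,ds = T\,d\!\left[\frac{(1+w)\rho}{T}\right]
      = (1+w)\,d\rho - (1+w)\,\rho\,\frac{dT}{T} ,
% which rearranges to
\frac{dT}{T} = \frac{w}{1+w}\,\frac{d\rho}{\rho}
\quad\Longrightarrow\quad T \propto \rho^{w/(1+w)} ,
% and then, again from (23.18),
s = \frac{(1+w)\rho}{T} \propto \rho^{\,1 - w/(1+w)} = \rho^{1/(1+w)}
\quad\Longrightarrow\quad
\frac{\delta s}{s} = \frac{1}{1+w}\,\frac{\delta\rho}{\rho} .
```

For radiation, w = 1/3, this reproduces the familiar $T \propto \rho^{1/4}$ and $s \propto \rho^{3/4} \propto T^3$.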

23.3 Diffusive damping

The treatment of matter and radiation as ideal fluids misses a feature that has a major impact on observed fluctuations in the CMB, namely the diffusive damping of sound waves that results from the finite mean free path to electron-photon scattering. As recombination approaches, the scattering mean free path lengthens, until at recombination photons are able to travel freely across the Universe, ready to be observed by astronomers. The damping is greater at smaller scales, and is responsible for the systematic decrease in the CMB power spectrum to smaller scales.

As expounded in Chapter 24, in §24.13 and following, the damping can be taken into account to lowest order, the diffusion approximation, by admitting a small quadrupole moment $\Theta_2$ to the photon distribution. A detailed analysis of the collisional Boltzmann equation for photons, §24.12, reveals that the photon quadrupole $\Theta_2$ is given in the diffusion approximation by equation (24.94). There is an additional source of damping that arises from viscous baryon drag, §24.15, but this effect vanishes in the limit of small baryon density, and is neglected in the present Chapter.

The diffusive damping resulting from a small photon quadrupole conserves the energy and momentum of the photon fluid, so that covariant momentum conservation $D_m T^{mn} = 0$ continues to hold true within the photon fluid. By contrast, viscous baryon drag, §24.15, neglected in this Chapter, transfers momentum between photons and baryons. Define the dimensionless quadrupole q by

\[ T^{ij}_{\rm quadrupole} = (1 + w)\bar\rho\, q \left( \tfrac{3}{2}\hat k_i \hat k_j - \tfrac{1}{2}\delta_{ij} \right) , \quad (1 + w)\bar\rho\, q \equiv \left( \hat k_i \hat k_j - \tfrac{1}{3}\delta_{ij} \right) T^{ij} . \tag{23.22} \]

For photons, the dimensionless quadrupole q is related to the photon quadrupole harmonic $\Theta_2$ by, equation (24.55d),

\[ q = -2\Theta_2 . \tag{23.23} \]

In the presence of a quadrupole pressure, the energy conservation equation (23.8a) is unchanged, but the momentum conservation equation (23.8b) is modified by the change Ψ → Ψ + q:

\[ D_m T^{mi} = \frac{1}{a}\left[ (1 + w)\frac{\partial \rho v_i}{\partial\eta} + 4(1 + w)\frac{\dot a}{a}\rho v_i + w\nabla_i\rho + (1 + w)\rho\nabla_i(\Psi + q) \right] = 0 . \tag{23.24} \]

The consequent equations (23.10b), (23.12b), (23.13) are similarly modified by Ψ → Ψ + q. In particular, the velocity equation (23.12b) is modified to

\[ \dot v + (1 - 3w)\frac{\dot a}{a}v + wk\delta = -k(\Psi + q) . \tag{23.25} \]

23.4 Equations for the simplest set of assumptions

Non-baryonic cold dark matter, subscripted c:

\[ \dot\delta_c - k v_c = 3\dot\Phi , \tag{23.26a} \]
\[ \dot v_c + \frac{\dot a}{a}v_c = -k\Psi . \tag{23.26b} \]

Radiation, which includes both photons and neutrinos:

\[ \dot\Theta_0 - k\Theta_1 = \dot\Phi , \tag{23.27a} \]
\[ \dot\Theta_1 + \frac{k}{3}\Theta_0 = -\frac{k}{3}(\Psi - 2\Theta_2) , \tag{23.27b} \]
\[ \Theta_2 = -\frac{4k}{9\,\bar n_e \sigma_T a}\Theta_1 . \tag{23.27c} \]

Einstein energy and quadrupole pressure equations:

\[ -k^2\Phi - 3\frac{\dot a}{a}F = 4\pi G a^2 (\bar\rho_c \delta_c + 4\bar\rho_r \Theta_0) , \tag{23.28a} \]
\[ k^2(\Psi - \Phi) = -32\pi G a^2 \bar\rho_r \Theta_2 , \tag{23.28b} \]

where

\[ F \equiv \frac{\dot a}{a}\Psi + \dot\Phi . \tag{23.29} \]

In place of the Einstein energy equation (23.28a) it is sometimes convenient to use the Einstein momentum equation

\[ -kF = 4\pi G a^2 (\bar\rho_c v_c + 4\bar\rho_r \Theta_1) , \tag{23.30} \]

which, because the matter and radiation equations (23.26) and (23.27) already satisfy covariant energy-momentum conservation, is not an independent equation.

The radiation quadrupole $\Theta_2$, equation (23.27c), derived in Chapter 24, equation (24.94), is proportional to the comoving mean free path $l_T$ to electron-photon (Thomson) scattering,

\[ l_T \equiv \frac{1}{\bar n_e \sigma_T a} , \tag{23.31} \]

where $\sigma_T$ is the Thomson cross-section. The quadrupole becomes important only near recombination, where the increasing mean free path to electron-photon scattering leads to dissipation of photon-baryon sound waves.

In the simple treatment of this Chapter, neutrinos are being lumped with photons, and of course neutrinos do not scatter, but rather stream freely. However, radiation is gravitationally sub-dominant near recombination, so not much error arises from treating neutrinos as gravitationally the same as photons near recombination. To make comparison with observed CMB fluctuations, the important thing is to follow the evolution of the photon multipoles, and for this purpose the radiation quadrupole defined by equation (23.27c), without any correction for neutrino-to-photon ratio, is the appropriate choice.

In much of the remainder of this Chapter, that is, excepting in §23.7 and §23.14, the radiation quadrupole will be set to zero,

\[ \Theta_2 = 0 , \tag{23.32} \]

which is equivalent to neglecting the effect of damping. If the radiation quadrupole vanishes, then the Einstein quadrupole pressure equation (23.28b) implies that the scalar potentials Ψ and Φ are equal,

\[ \Psi = \Phi . \tag{23.33} \]

In any case, the radiation quadrupole is always small, so that Ψ ≈ Φ to a good approximation.

23.5 Unperturbed background

In the unperturbed background, the unperturbed dark matter density $\bar\rho_c$ and radiation density $\bar\rho_r$ evolve with cosmic scale factor as

\[ \bar\rho_c \propto a^{-3} , \quad \bar\rho_r \propto a^{-4} . \tag{23.34} \]

The Hubble parameter H is defined in the usual way to be

\[ H \equiv \frac{1}{a}\frac{da}{dt} = \frac{\dot a}{a^2} , \tag{23.35} \]

in which overdot represents differentiation with respect to conformal time, $\dot a \equiv da/d\eta$. The Friedmann equations for the background imply that the Hubble parameter for a universe dominated by dark matter and radiation is

\[ H^2 = \frac{8\pi G}{3}(\bar\rho_c + \bar\rho_r) = \frac{H_{\rm eq}^2}{2}\left( \frac{a_{\rm eq}^3}{a^3} + \frac{a_{\rm eq}^4}{a^4} \right) , \tag{23.36} \]

where $a_{\rm eq}$ and $H_{\rm eq}$ are the cosmic scale factor and the Hubble parameter at the time of matter-radiation equality, $\bar\rho_c = \bar\rho_r$. The comoving horizon distance η is defined to be the comoving distance that light travels starting from zero expansion. This is

\[ \eta = \int_0^a \frac{da}{a^2 H} = \frac{2\sqrt{2}}{a_{\rm eq} H_{\rm eq}}\left( \sqrt{1 + \frac{a}{a_{\rm eq}}} - 1 \right) = \frac{2\sqrt{2}}{a_{\rm eq} H_{\rm eq}}\,\frac{a/a_{\rm eq}}{1 + \sqrt{1 + a/a_{\rm eq}}} . \tag{23.37} \]

In the radiation- and matter-dominated epochs respectively, the comoving horizon distance η is

\[ \eta = \begin{cases} \dfrac{\sqrt{2}}{a_{\rm eq} H_{\rm eq}}\,\dfrac{a}{a_{\rm eq}} \propto a & \text{radiation-dominated} , \\[2ex] \dfrac{2\sqrt{2}}{a_{\rm eq} H_{\rm eq}}\left( \dfrac{a}{a_{\rm eq}} \right)^{1/2} \propto a^{1/2} & \text{matter-dominated} . \end{cases} \tag{23.38} \]
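The closed form (23.37) and its two limits (23.38) are easy to check numerically. The sketch below works in units $a_{\rm eq} = H_{\rm eq} = 1$ by default; the function names and the midpoint-rule integrator are illustrative choices, not part of the text.

```python
import math

def hubble(a, a_eq=1.0, h_eq=1.0):
    """Hubble parameter for matter plus radiation, equation (23.36)."""
    x = a_eq / a
    return h_eq * math.sqrt(0.5 * (x**3 + x**4))

def eta_closed(a, a_eq=1.0, h_eq=1.0):
    """Closed-form comoving horizon distance, last form of equation (23.37)."""
    x = a / a_eq
    # x / (1 + sqrt(1 + x)) equals sqrt(1 + x) - 1 but avoids cancellation at small x
    return 2.0 * math.sqrt(2.0) / (a_eq * h_eq) * x / (1.0 + math.sqrt(1.0 + x))

def eta_numeric(a, n=20000, a_eq=1.0, h_eq=1.0):
    """Midpoint-rule estimate of the integral of da / (a^2 H) from 0 to a."""
    da = a / n
    total = 0.0
    for i in range(n):
        ai = (i + 0.5) * da
        total += da / (ai * ai * hubble(ai, a_eq, h_eq))
    return total

if __name__ == "__main__":
    for a in (1e-3, 1.0, 1e3):
        print(a, eta_closed(a), eta_numeric(a))
```

The integrand $1/(a^2 H)$ is finite at a = 0, which is why a plain midpoint rule is adequate here.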

The ratio of the comoving horizon distance η to the comoving cosmological horizon distance 1/(aH) is

\[ \eta a H = \frac{2\sqrt{1 + a/a_{\rm eq}}}{1 + \sqrt{1 + a/a_{\rm eq}}} , \tag{23.39} \]

which is evidently a number of order unity, varying between 1 in the radiation-dominated epoch $a \ll a_{\rm eq}$, and 2 in the matter-dominated epoch $a \gg a_{\rm eq}$.

Exercise 23.2 Matter-radiation equality.
1. Argue that the redshift $z_{\rm eq}$ of matter-radiation equality is given by

\[ 1 + z_{\rm eq} = \frac{a_0}{a_{\rm eq}} = \;?\;\, \Omega_m h^2 , \tag{23.40} \]

where $\Omega_m$ is the matter density today relative to critical. What is the factor, and what is its numerical value? The factor depends on the energy-weighted effective number of relativistic species $g_\rho$, Exercise 10.18. Should this $g_\rho$ be that now, or that at matter-radiation equality?
2. Show that the ratio $H_{\rm eq}/H_0$ of the Hubble parameter at matter-radiation equality to that today is

\[ \frac{H_{\rm eq}}{H_0} = \sqrt{2\Omega_m}\,(1 + z_{\rm eq})^{3/2} . \tag{23.41} \]

Solution. The redshift $z_{\rm eq}$ of matter-radiation equality is given by

\[ 1 + z_{\rm eq} = \frac{\Omega_m}{\Omega_r} = \frac{45 c^5 \hbar^3\, \Omega_m H_0^2}{4\pi^3 G g_\rho (kT_0)^4} = 8.093 \times 10^4\, \frac{\Omega_m h^2}{g_\rho} = 3200 \left( \frac{g_\rho}{3.36} \right)^{-1} \left( \frac{\Omega_m h^2}{0.133} \right) , \tag{23.42} \]

where $T_0 = 2.725$ K is the present-day CMB temperature, and $g_\rho = 2 + 6 \cdot \tfrac{7}{8} \left( \tfrac{4}{11} \right)^{4/3} = 3.36$ is the energy-weighted effective number of relativistic species at matter-radiation equality.
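The numbers in equation (23.42) can be reproduced directly from physical constants. The sketch below assumes rounded CODATA values in SI units and the megaparsec length quoted in the comments; those values are supplied here as assumptions and do not come from the text.

```python
import math

# SI constants (rounded CODATA values, assumed here); T0 follows the text
C = 2.99792458e8         # speed of light, m/s
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23       # Boltzmann constant, J/K
T0 = 2.725               # present-day CMB temperature, K
MPC = 3.0856775814913673e22   # one megaparsec in metres
H100 = 1.0e5 / MPC       # 100 km/s/Mpc expressed in s^-1, so H0 = h * H100

def g_rho():
    """Energy-weighted effective number of relativistic species at equality."""
    return 2.0 + 6.0 * (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0)

def zeq_prefactor():
    """Coefficient multiplying Omega_m h^2 / g_rho in equation (23.42)."""
    return (45.0 * C**5 * HBAR**3 * H100**2
            / (4.0 * math.pi**3 * G * (K_B * T0) ** 4))

def one_plus_zeq(omega_m_h2):
    """1 + z_eq for a given value of Omega_m h^2."""
    return zeq_prefactor() * omega_m_h2 / g_rho()

if __name__ == "__main__":
    print(g_rho(), zeq_prefactor(), one_plus_zeq(0.133))
```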

23.6 Generic behaviour of non-baryonic cold dark matter

Combining equations (23.26) for the dark matter overdensity and velocity gives

\[ \left( \frac{d^2}{d\eta^2} + \frac{\dot a}{a}\frac{d}{d\eta} \right)(\delta_c - 3\Phi) = -k^2\Psi = -k^2\Phi , \tag{23.43} \]

where the last expression follows because Ψ = Φ to a good approximation. In the absence of a driving potential, Φ = 0, the dark matter velocity would redshift as $v_c \propto 1/a$, and the dark matter density equation (23.26a) would then imply that $\dot\delta_c = k v_c \propto a^{-1}$. In the radiation-dominated epoch, where η ∝ a, this leads to a logarithmic growth in the overdensity $\delta_c$, even though there is no driving potential, and the velocity is redshifting to a halt. In the matter-dominated epoch, where $\eta \propto a^{1/2}$, the dark matter overdensity $\delta_c$ would freeze out at a constant value, in the absence of a driving potential.

Exercise 23.3 Generic behaviour of dark matter. Find the homogeneous solutions of equation (23.43). Hence find the retarded Green’s function of the equation. Write down the general solution of equation (23.43) as an integral over the Green’s function.

Solution. The general solution of equation (23.43) is, in units $a_{\rm eq} = H_{\rm eq} = 1$,

\[ \delta_c(a) - 3\Phi(a) = A_0 + A_1 \ln\frac{\sqrt{1 + a} + 1}{\sqrt{1 + a} - 1} + 2k^2 \int_0^a \ln\left[ \frac{(\sqrt{1 + a} + 1)(\sqrt{1 + a'} - 1)}{(\sqrt{1 + a} - 1)(\sqrt{1 + a'} + 1)} \right] \Phi(a')\, \frac{a'\,da'}{\sqrt{1 + a'}} , \tag{23.44} \]

where $A_0$ and $A_1$ are constants.

23.7 Generic behaviour of radiation

Combining equations (23.27) for the radiation monopole, dipole, and quadrupole gives

\[ \left( 3\frac{d^2}{d\eta^2} + 2\sqrt{3}\,\kappa\frac{d}{d\eta} + k^2 \right)(\Theta_0 - \Phi) = -k^2(\Psi + \Phi) = -2k^2\Phi , \tag{23.45} \]

where in the last expression the approximation Ψ = Φ has again been invoked. The coefficient κ of the linear derivative term in equation (23.45) is a damping coefficient

\[ \kappa \equiv \frac{4k^2 l_T}{9\sqrt{3}} , \tag{23.46} \]

where $l_T$ is the comoving electron-photon scattering (Thomson) mean free path, equation (23.31). The mean free path is small, Exercise 23.6, except near recombination. In the absence of a driving potential, Φ = 0, and in the absence of damping, κ = 0, the radiation oscillates as $\Theta_0 \propto e^{\pm i\omega\eta}$ with frequency $\omega = \sqrt{1/3}\,k$. In other words, the solutions are sound waves, moving at the sound speed

\[ c_s = \frac{\omega}{k} = \sqrt{\frac{1}{3}} . \tag{23.47} \]

Define the conformal sound time by

\[ \eta_s \equiv c_s \eta = \frac{\eta}{\sqrt{3}} . \tag{23.48} \]

In terms of the conformal sound time $\eta_s$, the differential equation (23.45) becomes

\[ \left( \frac{d^2}{d\eta_s^2} + 2\kappa\frac{d}{d\eta_s} + k^2 \right)(\Theta_0 - \Phi) = -2k^2\Phi . \tag{23.49} \]
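The origin of the damping term in equation (23.45) can be traced in a few lines. The sketch below uses only equations (23.27a)–(23.27c) and the definition (23.31) of $l_T$.

```latex
% Differentiate the monopole equation (23.27a) and use the dipole equation (23.27b):
\ddot\Theta_0 - \ddot\Phi = k\,\dot\Theta_1
  = -\frac{k^2}{3}\left( \Theta_0 + \Psi - 2\Theta_2 \right) .
% Insert the diffusion-approximation quadrupole (23.27c), with
% k \Theta_1 = \dot\Theta_0 - \dot\Phi from (23.27a):
2\Theta_2 = -\frac{8}{9}\,k\,l_T\,\Theta_1
          = -\frac{8}{9}\,l_T\,\frac{d}{d\eta}\left( \Theta_0 - \Phi \right) .
% Multiply by 3 and collect terms in \Theta_0 - \Phi:
\left( 3\frac{d^2}{d\eta^2} + \frac{8}{9}k^2 l_T \frac{d}{d\eta} + k^2 \right)
  \left( \Theta_0 - \Phi \right) = -k^2\left( \Psi + \Phi \right) ,
\qquad \frac{8}{9}k^2 l_T = 2\sqrt{3}\,\kappa .
```

The last identity is just the definition (23.46) of κ rearranged.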

As you will discover in Exercises 23.4 and 23.5, equation (23.49) describes damped oscillations forced by the potential on the right hand side.

Exercise 23.4 Generic behaviour of radiation. Find the homogeneous solutions of equation (23.49) in the case of zero damping, κ = 0. Hence find the retarded Green’s function of the equation. Write down the general solution of equation (23.49) as an integral over the Green’s function. Convince yourself that $\Theta_0 - \Phi$ oscillates about −2Φ.

Solution. The general solution of equation (23.49) is, with $\alpha \equiv k\eta_s$,

\[ \Theta_0(\alpha) - \Phi(\alpha) = B_0 \cos\alpha + B_1 \sin\alpha - 2\int_0^\alpha \sin(\alpha - \alpha')\,\Phi(\alpha')\,d\alpha' , \tag{23.50} \]

where $B_0$ and $B_1$ are constants.

Exercise 23.5 Behaviour of radiation in the presence of damping. Now suppose that there is a small damping coefficient, κ ≪ k. Try a solution of the form $\Theta_0 - \Phi \propto e^{\int \omega\,d\eta_s}$ in equation (23.49). Suppose that the frequency ω changes slowly over a period, $\omega' \ll \omega^2$, so that ω′ can be set to zero. Show that the homogeneous solutions of equation (23.49) are approximately $\Theta_0 - \Phi \propto e^{-\int \kappa\,d\eta_s \pm ik\eta_s}$. Hence find the retarded Green’s function, and write down the general solution to equation (23.49).

Solution. See §24.17. The general solution of equation (23.49) is, with $\alpha \equiv k\eta_s$ and $\beta \equiv \int \kappa\,d\eta_s$,

\[ \Theta_0(\alpha) - \Phi(\alpha) = e^{-\beta}(B_0 \cos\alpha + B_1 \sin\alpha) - 2\int_0^\alpha e^{-(\beta - \beta')} \sin(\alpha - \alpha')\,\Phi(\alpha')\,d\alpha' , \tag{23.51} \]

where $B_0$ and $B_1$ are constants.

23.8 Regimes

In the remainder of this Chapter, approximate analytic solutions are developed that describe the evolution of perturbations in the matter and radiation in various regimes. The regimes are:

1. Superhorizon scales, §23.9.
2. Radiation-dominated:
   a. adiabatic initial conditions, §23.10;
   b. isocurvature initial conditions, §23.11.
3. Subhorizon scales, §23.12.
4. Matter-dominated, §23.13.
5. Recombination, §23.14.
6. Post-recombination, §23.15.
7. Matter with dark energy, §23.16.
8. Matter with dark energy and curvature, §23.17.

23.9 Superhorizon scales

At sufficiently early times, any mode is outside the horizon, kη < 1. In the superhorizon limit kη ≪ 1, the evolution equations (23.26)–(23.28) reduce to

\[ \dot\delta_c = 3\dot\Phi , \tag{23.52a} \]
\[ \dot\Theta_0 = \dot\Phi , \tag{23.52b} \]
\[ -3\frac{\dot a}{a}F = 4\pi G a^2 (\bar\rho_c \delta_c + 4\bar\rho_r \Theta_0) . \tag{23.52c} \]

The first two of these equations evidently imply that the dark matter overdensity $\delta_c$ and radiation monopole $\Theta_0$ are related to the potential Φ by

\[ \delta_c = 3\Phi + \text{constant} , \tag{23.53a} \]
\[ \Theta_0 = \Phi + \text{constant} . \tag{23.53b} \]

[Figure 23.1: comoving distance (vertical axis) against cosmic scale factor (horizontal axis, with $a_{\rm eq}$ and $a_{\rm rec}$ marked). The horizon line separates superhorizon from subhorizon scales; regions are labelled radiation background, matter fluctuation, radiation, and matter.]

Figure 23.1 Various regimes in the evolution of fluctuations. The line increasing diagonally from bottom left to top right is the comoving horizon distance η. Above this line are superhorizon fluctuations, whose comoving wavelengths exceed the horizon distance, while below the line are subhorizon fluctuations, whose comoving wavelengths are less than the horizon distance. The vertical line at cosmic scale factor $a_{\rm eq} \approx a_0/3200$ marks the moment of matter-radiation equality. Before matter-radiation equality (to the left), the background mass-energy is dominated by radiation, while after matter-radiation equality (to the right), the background mass-energy is dominated by matter. Once a fluctuation enters the horizon, the non-baryonic matter fluctuation tends to grow, whereas the radiation fluctuation tends to decay, so there is an epoch prior to matter-radiation equality where gravitational perturbations are dominated by matter rather than radiation fluctuations, even though radiation dominates the background energy density. The vertical line at $a_* \approx a_0/1100$ marks recombination, where the temperature has cooled to the point that baryons change from being mostly ionized to mostly neutral, and the Universe changes from being opaque to transparent. The observed CMB comes from the time of recombination.

In effect, the dark matter velocity $v_c$ and radiation dipole $\Theta_1$ are negligibly small at superhorizon scales,

\[ v_c = \Theta_1 = 0 . \tag{23.54} \]

Plugging the solutions (23.53) into the Einstein energy equation (23.52c), and replacing derivatives with respect to conformal time η with derivatives with respect to cosmic scale factor a,

\[ \frac{\partial}{\partial\eta} = \dot a\frac{\partial}{\partial a} = a^2 H\frac{\partial}{\partial a} , \tag{23.55} \]

with the Hubble parameter H from equation (23.36) gives the first order differential equation, in units $a_{\rm eq} = H_{\rm eq} = 1$,

\[ 2a(1 + a)\Phi' + (6 + 5a)\Phi + 4C_0 + C_1 a = 0 , \tag{23.56} \]

where prime ′ denotes differentiation with respect to cosmic scale factor, d/da, and the constants $C_0$ and $C_1$ are

\[ C_0 = \Theta_0(0) - \Phi(0) , \quad C_1 = \delta_c(0) - 3\Phi(0) . \tag{23.57} \]

[Figure 23.2: Φ/Φ(late) (vertical axis, 0 to 1) against cosmic scale factor $a/a_{\rm eq}$ (horizontal axis, $10^{-3}$ to $10^3$), with curves labelled adiabatic and isocurvature.]

Figure 23.2 Evolution of the scalar potential Φ at superhorizon scales, from radiation-dominated to matter-dominated. The scale for the potential is normalized to its value Φ(late) at late times $a \gg a_{\rm eq}$.

The constants $C_0$ and $C_1$ are set by initial conditions. There are adiabatic and isocurvature initial conditions. Inflation generically produces adiabatic fluctuations, in which matter and radiation fluctuate together:

\[ \delta_c(0) = 3\Theta_0(0) = -\tfrac{3}{2}\Phi(0) \quad \text{adiabatic} . \tag{23.58} \]

Notice that a positive energy fluctuation corresponds to a negative potential, consistent with Newtonian intuition. Isocurvature initial conditions are defined by the vanishing of the initial potential, Φ(0) = 0. This, together with equations (23.56) and (23.57), implies the isocurvature initial conditions

\[ \Phi(0) = \Theta_0(0) = 0 , \quad \delta_c(0) = -8\Phi'(0) = -8\Theta_0'(0) \quad \text{isocurvature} . \tag{23.59} \]

Subject to the condition that Φ remains finite at a → 0, the adiabatic and isocurvature solutions to equation (23.56) are, in units $a_{\rm eq} = 1$,

\[ \Phi_{\rm ad} = \frac{\Phi(0)}{10}\left( 9 + \frac{2}{a} - \frac{8}{a^2} - \frac{16}{a^3} + \frac{16\sqrt{1 + a}}{a^3} \right) = \frac{\Phi(0)}{10}\,\frac{3(14 + 9a) + (38 + 9a)\sqrt{1 + a}}{\left( 1 + \sqrt{1 + a} \right)^3} , \tag{23.60a} \]
\[ \Phi_{\rm iso} = \frac{8\Phi'(0)}{5}\left( 1 - \frac{2}{a} + \frac{8}{a^2} + \frac{16}{a^3} - \frac{16\sqrt{1 + a}}{a^3} \right) = \frac{8\Phi'(0)}{5}\,\frac{a\left( 6 + a + 4\sqrt{1 + a} \right)}{\left( 1 + \sqrt{1 + a} \right)^4} , \tag{23.60b} \]

in which the last expressions in each case are written in a form that is numerically well-behaved for all a. Figure 23.2 shows the evolution of the potential Φ from equations (23.60), normalized to the value Φ(late) at late times $a \gg a_{\rm eq}$. For adiabatic fluctuations, the potential changes by a factor of 9/10 from initial to final value, while for isocurvature fluctuations the potential evolves from zero to a final value of $\tfrac{8}{5}\Phi'(0)$:

\[ \Phi_{\rm ad}({\rm late}) = \tfrac{9}{10}\Phi(0) = -\tfrac{3}{5}\delta_c(0) \quad \text{adiabatic} , \tag{23.61a} \]
\[ \Phi_{\rm iso}({\rm late}) = \tfrac{8}{5}\Phi'(0) = -\tfrac{1}{5}\delta_c(0) \quad \text{isocurvature} . \tag{23.61b} \]

The superhorizon solutions (23.53) for the dark matter overdensity $\delta_c$ and radiation monopole $\Theta_0$ are

\[ \delta_c = 3\Theta_0 = 3\left[ \Phi_{\rm ad} - \tfrac{3}{2}\Phi(0) \right] \quad \text{adiabatic} , \tag{23.62a} \]
\[ \delta_c = 3\Phi_{\rm iso} - 8\Phi'(0) , \quad \Theta_0 = \Phi_{\rm iso} \quad \text{isocurvature} . \tag{23.62b} \]
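As a cross-check on the factor of 9/10, the equations of §23.4 can be integrated numerically for a mode that stays outside the horizon. The sketch below is one way to do that, under these assumptions: units $a_{\rm eq} = H_{\rm eq} = 1$, no damping ($\Theta_2 = 0$, Ψ = Φ), adiabatic initial conditions (23.58) with Φ(0) = 1, and ln a as the integration variable. The function names and the fixed-step Runge-Kutta scheme are illustrative choices.

```python
import math

def phi_ad(a):
    """Adiabatic superhorizon potential, stable form of (23.60a), in units of Phi(0)."""
    s = math.sqrt(1.0 + a)
    return (3.0 * (14.0 + 9.0 * a) + (38.0 + 9.0 * a) * s) / (10.0 * (1.0 + s) ** 3)

def derivs(x, y, k):
    """d/d(ln a) of (delta_c, v_c, Theta0, Theta1, Phi) with Theta2 = 0, Psi = Phi."""
    a = math.exp(x)
    delta_c, v_c, th0, th1, phi = y
    conf_h = math.sqrt((1.0 + a) / 2.0) / a   # adot/a = a H, from (23.36)
    # Einstein energy equation (23.28a), solved for dPhi/d(ln a)
    dphi = (-phi
            - 2.0 * k * k * a * a * phi / (3.0 * (1.0 + a))
            - (a * delta_c + 4.0 * th0) / (2.0 * (1.0 + a)))
    ddelta = k * v_c / conf_h + 3.0 * dphi     # (23.26a)
    dv = -v_c - k * phi / conf_h               # (23.26b)
    dth0 = k * th1 / conf_h + dphi             # (23.27a)
    dth1 = -(k / 3.0) * (th0 + phi) / conf_h   # (23.27b)
    return (ddelta, dv, dth0, dth1, dphi)

def integrate(k, a_start=1e-4, a_end=100.0, steps=4000):
    """Classical fourth-order Runge-Kutta in ln a from adiabatic initial conditions."""
    x = math.log(a_start)
    dx = (math.log(a_end) - x) / steps
    y = (-1.5, 0.0, -0.5, 0.0, 1.0)   # delta_c, v_c, Theta0, Theta1, Phi
    for _ in range(steps):
        k1 = derivs(x, y, k)
        k2 = derivs(x + 0.5 * dx, tuple(yi + 0.5 * dx * ki for yi, ki in zip(y, k1)), k)
        k3 = derivs(x + 0.5 * dx, tuple(yi + 0.5 * dx * ki for yi, ki in zip(y, k2)), k)
        k4 = derivs(x + dx, tuple(yi + dx * ki for yi, ki in zip(y, k3)), k)
        y = tuple(yi + dx * (p + 2.0 * q + 2.0 * r + s) / 6.0
                  for yi, p, q, r, s in zip(y, k1, k2, k3, k4))
        x += dx
    return y

if __name__ == "__main__":
    final = integrate(k=1e-3)
    print("Phi(a=100) numeric:", final[4], " analytic:", phi_ad(100.0))
```

With k = 10⁻³ the mode satisfies kη ≪ 1 throughout the integration, so the numerical potential should track the analytic superhorizon solution closely.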

23.10 Radiation-dominated, adiabatic initial conditions

For adiabatic initial conditions, fluctuations that enter the horizon before matter-radiation equality, $k\eta_{\rm eq} \gg 1$, are dominated by radiation. In the regime where radiation dominates both the unperturbed energy and its fluctuations, the relevant equations are, from equations (23.27), (23.28), and (23.30),

\[ \dot\Theta_0 - k\Theta_1 = \dot\Phi , \tag{23.63a} \]
\[ -k^2\Phi - 3\frac{\dot a}{a}F = 16\pi G a^2 \bar\rho_r \Theta_0 , \tag{23.63b} \]
\[ -kF = 16\pi G a^2 \bar\rho_r \Theta_1 , \tag{23.63c} \]

in which, because it simplifies the mathematics, the Einstein momentum equation is used as a substitute for the radiation dipole equation. In the radiation-dominated epoch, the horizon is proportional to the cosmic scale factor, η ∝ a, equation (23.38). Inserting $\Theta_0$ and $\Theta_1$ from the Einstein energy and momentum equations (23.63b) and (23.63c) into the radiation monopole equation (23.63a) gives a second order differential equation for the potential Φ:

\[ \ddot\Phi + \frac{4}{\eta}\dot\Phi + \frac{k^2}{3}\Phi = 0 . \tag{23.64} \]

[Figure 23.3: two panels plotting $\Theta_0 - \Phi$ and −2Φ (vertical axis) against $k\eta_s/\pi$ (horizontal axis, 0 to 8). Top panel: adiabatic initial conditions. Bottom panel: isocurvature initial conditions.]

Figure 23.3 The difference $\Theta_0 - \Phi$ between the radiation monopole and the Newtonian scalar potential oscillates about −2Φ, in accordance with equation (23.45). The difference $(\Theta_0 - \Phi) - (-2\Phi) = \Theta_0 + \Phi$, which is the temperature $\Theta_0$ redshifted by the potential Φ, is the monopole contribution to temperature fluctuation of the CMB. The top panel is for adiabatic initial conditions, equation (23.58), while the bottom panel is for isocurvature initial conditions, equation (23.59). The units of Φ and $\Theta_0$ are such that Φ(0) = −1 for adiabatic fluctuations, and $\delta_c(0) = 1$ for isocurvature fluctuations. In each case, the thin lines show the evolution of small scale fluctuations, which enter the horizon during the radiation-dominated epoch well before matter-radiation equality, while the thick lines show the evolution of large-scale fluctuations, which enter the horizon during the matter-dominated epoch well after matter-radiation equality.

Equation (23.64) describes damped sound waves moving at sound speed $\sqrt{1/3}$ times the speed of light. The sound horizon, the comoving distance that sound can travel, is $\eta/\sqrt{3}$, the horizon distance η multiplied by the sound speed. The growing and decaying solutions to equation (23.64) are

\begin{equation}
\Phi_{\rm grow} = \frac{3 j_1(\alpha)}{\alpha} = \frac{3 (\sin\alpha - \alpha\cos\alpha)}{\alpha^3} , \quad
\Phi_{\rm decay} = - \frac{j_{-2}(\alpha)}{\alpha} = \frac{\cos\alpha + \alpha\sin\alpha}{\alpha^3} ,
\tag{23.65}
\end{equation}

where the dimensionless parameter α is the wavevector k multiplied by the sound horizon $\eta/\sqrt{3}$,

\begin{equation}
\alpha \equiv \frac{k\eta}{\sqrt{3}} = \sqrt{\frac{2}{3}} \, \frac{k}{a_{\rm eq} H_{\rm eq}} \, \frac{a}{a_{\rm eq}} ,
\tag{23.66}
\end{equation}

and $j_l(\alpha) \equiv \sqrt{\pi/(2\alpha)} \, J_{l+1/2}(\alpha)$ are spherical Bessel functions. The physically relevant solution that satisfies adiabatic initial conditions, remaining finite as α → 0, is the growing solution

\begin{equation}
\Phi = \Phi(0) \, \Phi_{\rm grow} .
\tag{23.67}
\end{equation}
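As a numerical sanity check, the sketch below evaluates the two solutions (23.65) and estimates the residual of equation (23.64), rewritten in terms of α as Φ'' + (4/α)Φ' + Φ = 0, by central differences. The function names and the step size h are illustrative choices, not part of the text.

```python
import math

def phi_grow(alpha):
    """Growing mode 3 (sin a - a cos a) / a^3 of equation (23.65)."""
    return 3.0 * (math.sin(alpha) - alpha * math.cos(alpha)) / alpha**3

def phi_decay(alpha):
    """Decaying mode (cos a + a sin a) / a^3 of equation (23.65)."""
    return (math.cos(alpha) + alpha * math.sin(alpha)) / alpha**3

def ode_residual(phi, alpha, h=1e-3):
    """Central-difference estimate of phi'' + (4/alpha) phi' + phi."""
    d1 = (phi(alpha + h) - phi(alpha - h)) / (2.0 * h)
    d2 = (phi(alpha + h) - 2.0 * phi(alpha) + phi(alpha - h)) / h**2
    return d2 + 4.0 * d1 / alpha + phi(alpha)

if __name__ == "__main__":
    for alpha in (1.0, 3.0, 10.0):
        print(alpha, ode_residual(phi_grow, alpha), ode_residual(phi_decay, alpha))
```

Both residuals should be small compared with the size of the individual terms, and phi_grow tends to 1 as α → 0, which is the normalization used in equation (23.67).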

The solution (23.65) shows that, after a mode enters the sound horizon, the scalar potential Φ oscillates with an envelope that decays as $\alpha^{-2}$. Physically, relativistically propagating sound waves tend to suppress the gravitational potential Φ. For the growing solution (23.67), the radiation monopole Θ0 is

\begin{equation}
\Theta_0 = \Phi(0) \, \frac{3}{\alpha^3} \left[ (1 - \alpha^2) \sin\alpha - \alpha \left( 1 - \tfrac{1}{2} \alpha^2 \right) \cos\alpha \right] .
\tag{23.68}
\end{equation}

The thin lines in the top panel of Figure 23.3 show the growing mode potential Φ and the radiation monopole Θ0, equation (23.68). The Figure plots these two quantities in the form −2Φ and Θ0 − Φ, to bring out the fact that Θ0 − Φ oscillates about −2Φ, in accordance with equation (23.45). After a mode is well inside the sound horizon, α ≫ 1, the radiation monopole oscillates with constant amplitude,

\begin{equation}
\Theta_0 = \frac{3 \Phi(0)}{2} \cos\alpha \quad \mbox{for } \alpha \gg 1 .
\tag{23.69}
\end{equation}
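The two limits of equation (23.68) can be checked numerically. The sketch below, with an illustrative function name and the normalization Φ(0) = −1 used in Figure 23.3, compares the monopole with −Φ(0)/2 at small α and with the asymptotic form (23.69) at large α.

```python
import math

def theta0_adiabatic(alpha, phi0=-1.0):
    """Radiation monopole of equation (23.68) for the growing mode."""
    bracket = ((1.0 - alpha**2) * math.sin(alpha)
               - alpha * (1.0 - 0.5 * alpha**2) * math.cos(alpha))
    return phi0 * 3.0 * bracket / alpha**3

if __name__ == "__main__":
    # Small alpha: Theta_0 tends to -Phi(0)/2.
    print(theta0_adiabatic(0.01))
    # Large alpha: Theta_0 approaches (3/2) Phi(0) cos(alpha), equation (23.69).
    alpha = 200.0
    print(theta0_adiabatic(alpha), 1.5 * (-1.0) * math.cos(alpha))
```

The agreement at large α is only up to corrections of order 1/α, so the two printed numbers on the last line differ in the second decimal place.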

The dark matter fluctuations are driven by the gravitational potential of the radiation. The solution of the dark matter equation (23.43) driven by the potential (23.67) and satisfying adiabatic initial conditions (23.62a) is

\begin{equation}
\delta_c = 3 \Phi - 9 \Phi(0) \left[ \gamma - \frac{1}{2} + \ln\alpha - {\rm Ci}\,\alpha + \frac{\sin\alpha}{\alpha} \right] ,
\tag{23.70}
\end{equation}

where the potential Φ is the growing mode solution (23.67), γ ≡ 0.5772... is Euler’s constant, and ${\rm Ci}(\alpha) \equiv \int_\infty^\alpha \cos x \, dx / x$ is the cosine integral. Once the mode is well inside the sound horizon, α ≫ 1, the dark matter density δc, equation (23.70), evolves as

\begin{equation}
\delta_c = - 9 \Phi(0) \left[ \gamma - \frac{1}{2} + \ln\alpha \right] \quad \mbox{for } \alpha \gg 1 ,
\tag{23.71}
\end{equation}

which grows logarithmically. This logarithmic growth translates into a logarithmic increase in the amplitude of matter fluctuations at small scales, and is a characteristic signature of non-baryonic cold dark matter.
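The behaviour of equation (23.70) can be explored with a short script. This is a sketch under stated assumptions: the cosine integral is evaluated from the identity Ci(x) = γ + ln x + ∫₀ˣ (cos t − 1)/t dt using Simpson's rule, the function names are illustrative, and Φ(0) = −1 is the normalization of Figure 23.3.

```python
import math

EULER_GAMMA = 0.5772156649015329

def cosine_integral(x, n=20000):
    """Ci(x) from gamma + ln x + integral_0^x (cos t - 1)/t dt, by Simpson's rule."""
    def g(t):
        return (math.cos(t) - 1.0) / t if t != 0.0 else 0.0
    h = x / n  # n must be even for Simpson's rule
    total = g(0.0) + g(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return EULER_GAMMA + math.log(x) + total * h / 3.0

def phi_grow(alpha):
    """Growing mode of equation (23.65)."""
    return 3.0 * (math.sin(alpha) - alpha * math.cos(alpha)) / alpha**3

def delta_c(alpha, phi0=-1.0):
    """Dark matter overdensity of equation (23.70)."""
    bracket = (EULER_GAMMA - 0.5 + math.log(alpha)
               - cosine_integral(alpha) + math.sin(alpha) / alpha)
    return 3.0 * phi0 * phi_grow(alpha) - 9.0 * phi0 * bracket

def delta_c_log(alpha, phi0=-1.0):
    """Logarithmic late-time form, equation (23.71)."""
    return -9.0 * phi0 * (EULER_GAMMA - 0.5 + math.log(alpha))

if __name__ == "__main__":
    print(delta_c(0.01))                    # close to -(3/2) Phi(0)
    print(delta_c(50.0), delta_c_log(50.0))
```

At small α the overdensity tends to −(3/2)Φ(0), which equals 3Θ0(0) with Θ0(0) = −Φ(0)/2 from equation (23.68).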


23.11 Radiation-dominated, isocurvature initial conditions

For isocurvature initial conditions, the matter fluctuation contributes from the outset, $|\bar\rho_c \delta_c| > |\bar\rho_r \Theta_0|$, even while radiation dominates the background density, $\bar\rho_c \ll \bar\rho_r$. To develop an approximation adequate for isocurvature fluctuations entering the horizon well before matter-radiation equality, kηeq ≫ 1, regard the Einstein energy equation (23.28a) as giving the radiation monopole Θ0, and the Einstein momentum equation (23.30) as giving the radiation dipole Θ1. Insert these into the radiation monopole equation (23.27a), and eliminate the $\dot\delta_c$ terms using the dark matter density equation (23.26a). The result is, in units aeq = Heq = 1,

\begin{equation}
2 a (1 + a) \Phi'' + (8 + 9a) \Phi' + 2 \left( 1 + \frac{2 k^2 a}{3} \right) \Phi + \delta_c = 0 ,
\tag{23.72}
\end{equation}

where prime ′ denotes differentiation with respect to cosmic scale factor a. Equation (23.72) is valid in all regimes, for any combination of matter and radiation. For isocurvature initial conditions, the radiation monopole and potential vanish initially, Θ0(0) = Φ(0) = 0, whereas the dark matter overdensity is finite, δc(0) ≠ 0. For small scales that enter the horizon well before matter-radiation equality, kηeq ≫ 1, the potential Φ is small, while δc has some approximately constant non-zero value, up to and through the time when the mode enters the horizon, kη ≈ ka ≈ 1. In the radiation-dominated epoch, a ≪ 1, and with Φ ≈ 0 (but k large and ka ∼ 1, so $k^2 a \Phi$ is not small) equation (23.72) simplifies to

\begin{equation}
2 a \Phi'' + 8 \Phi' + \frac{4 k^2 a}{3} \Phi + \delta_c = 0 .
\tag{23.73}
\end{equation}

The solution of equation (23.73) for constant δc = δc(0) is, with α given by equation (23.66),

\begin{equation}
\Phi = - \frac{\delta_c(0)}{\sqrt{2/3} \, k} \, \frac{1 + \tfrac{1}{2} \alpha^2 - \cos\alpha - \alpha\sin\alpha}{\alpha^3} .
\tag{23.74}
\end{equation}

The solution (23.51) for the radiation monopole Θ0 driven by the potential (23.74) is

\begin{equation}
\Theta_0 = - \frac{\delta_c(0)}{\sqrt{2/3} \, k} \, \frac{(- 1 + \alpha^2) \cos\alpha + (- 1 + \tfrac{1}{2} \alpha^2)(- 1 + \alpha\sin\alpha)}{\alpha^3} .
\tag{23.75}
\end{equation}

Equations (23.74) and (23.75) are the solution for small scale modes with isocurvature initial conditions that enter the horizon well before matter-radiation equality. After a mode is well inside the sound horizon, α ≫ 1, the radiation monopole (23.75) oscillates with constant amplitude,

\begin{equation}
\Theta_0 = - \frac{\delta_c(0)}{2 \sqrt{2/3} \, k} \sin\alpha \quad \mbox{for } \alpha \gg 1 .
\tag{23.76}
\end{equation}

Whereas for adiabatic initial conditions the radiation monopole oscillated as cos α well inside the horizon, equation (23.69), for isocurvature initial conditions it oscillates as sin α well inside the horizon, equation (23.76).
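The contrast between the cos α and sin α phases can be seen by evaluating equations (23.74) and (23.75) directly. The sketch below works in units aeq = Heq = 1; the function names and the sample values k = 1, δc(0) = 1 are illustrative assumptions.

```python
import math

def phi_iso(alpha, delta0=1.0, k=1.0):
    """Potential of equation (23.74)."""
    prefactor = -delta0 / (math.sqrt(2.0 / 3.0) * k)
    numerator = 1.0 + 0.5 * alpha**2 - math.cos(alpha) - alpha * math.sin(alpha)
    return prefactor * numerator / alpha**3

def theta0_iso(alpha, delta0=1.0, k=1.0):
    """Radiation monopole of equation (23.75)."""
    prefactor = -delta0 / (math.sqrt(2.0 / 3.0) * k)
    numerator = ((-1.0 + alpha**2) * math.cos(alpha)
                 + (-1.0 + 0.5 * alpha**2) * (-1.0 + alpha * math.sin(alpha)))
    return prefactor * numerator / alpha**3

def theta0_iso_late(alpha, delta0=1.0, k=1.0):
    """Asymptotic form, equation (23.76)."""
    return -delta0 / (2.0 * math.sqrt(2.0 / 3.0) * k) * math.sin(alpha)

if __name__ == "__main__":
    print(phi_iso(0.01), theta0_iso(0.01))            # both vanish as alpha -> 0
    print(theta0_iso(200.0), theta0_iso_late(200.0))  # agree up to O(1/alpha)
```

Both Φ and Θ0 start from zero, as the isocurvature initial conditions require, and the late-time monopole follows sin α rather than cos α.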


23.12 Subhorizon scales

After a mode enters the horizon, the radiation fluctuation Θ0 oscillates, but the non-baryonic cold dark matter fluctuation δc grows monotonically. In due course, the dark matter density fluctuation $\bar\rho_c \delta_c$ dominates the radiation density fluctuation $\bar\rho_r \Theta_0$, and this necessarily occurs before matter-radiation equality; that is, $|\bar\rho_c \delta_c| > |\bar\rho_r \Theta_0|$ even though $\bar\rho_c < \bar\rho_r$. This is true for both adiabatic and isocurvature initial conditions; of course, for isocurvature initial conditions, the dark matter density fluctuation dominates from the outset. Even before the dark matter density fluctuation dominates, the cumulative contribution of the dark matter to the potential Φ begins to be more important than that of the radiation, because the potential sourced by the radiation oscillates, with an effect that tends to cancel when averaged over an oscillation.

Regard the Einstein energy equation (23.28a) as giving the dark matter overdensity δc, and the Einstein momentum equation (23.30) as giving the dark matter velocity vc. Insert these into the dark matter density equation (23.26a) and eliminate the $\dot\Theta_0$ terms using the radiation monopole equation (23.27a). The result is, in units aeq = 1,

\begin{equation}
2 a^2 (1 + a) \Phi'' + a (6 + 7a) \Phi' - 2 \Phi - 4 \Theta_0 = 0 ,
\tag{23.77}
\end{equation}

where prime ′ denotes differentiation with respect to cosmic scale factor a. Equation (23.77) is valid in all regimes, for any combination of matter and radiation. Once the mode is well inside the horizon, kη ≫ 1, the radiation monopole Θ0 oscillates about an average value of −Φ (since Θ0 − Φ oscillates about −2Φ, as noted in §23.7):

\begin{equation}
\langle \Theta_0 \rangle = - \Phi .
\tag{23.78}
\end{equation}

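Inserting this cycle-averaged value of Θ0 into equation (23.77) gives the Meszaros differential equation

\begin{equation}
2 (1 + a) a^2 \Phi'' + (6 + 7a) a \Phi' + 2 \Phi = 0 .
\tag{23.79}
\end{equation}

The solutions of Meszaros’ differential equation (23.79) are

\begin{equation}
\Phi = - \frac{3}{4} \left( \frac{a_{\rm eq} H_{\rm eq}}{k} \right)^2 \frac{\delta_c}{a / a_{\rm eq}} ,
\tag{23.80}
\end{equation}

where the dark matter overdensity δc is a linear combination

\begin{equation}
\delta_c = C_{\rm grow} \, \delta_{c,\rm grow} + C_{\rm decay} \, \delta_{c,\rm decay}
\tag{23.81}
\end{equation}

of growing and decaying solutions, in units aeq = 1,

\begin{equation}
\delta_{c,\rm grow} = 1 + \frac{3}{2} a , \quad
\delta_{c,\rm decay} = \left( 1 + \frac{3}{2} a \right) \ln \left( \frac{\sqrt{1 + a} + 1}{\sqrt{1 + a} - 1} \right) - 3 \sqrt{1 + a} .
\tag{23.82}
\end{equation}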

For adiabatic initial conditions, the desired solution is the one that matches smoothly on to the logarithmically growing solution for the dark matter overdensity δc given by equation (23.70). For modes that enter the horizon well before matter-radiation equality, the matching may be done in the radiation-dominated epoch a ≪ 1, where the growing and decaying modes (23.82) simplify to

\begin{equation}
\delta_{c,\rm grow} = 1 , \quad
\delta_{c,\rm decay} = - \ln(a/4) - 3 , \quad
\mbox{for } a \ll 1 .
\tag{23.83}
\end{equation}
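The Meszaros modes (23.82) and their limits can be verified numerically. The following sketch, in units aeq = 1 and with illustrative function names, checks by central differences that Φ ∝ δc/a satisfies equation (23.79), and compares the decaying mode with its small-a and large-a forms.

```python
import math

def delta_grow(a):
    """Growing Meszaros mode, equation (23.82)."""
    return 1.0 + 1.5 * a

def delta_decay(a):
    """Decaying Meszaros mode, equation (23.82)."""
    s = math.sqrt(1.0 + a)
    return (1.0 + 1.5 * a) * math.log((s + 1.0) / (s - 1.0)) - 3.0 * s

def meszaros_residual(delta, a, h=1e-4):
    """Left hand side of (23.79) for Phi = delta(a)/a, via central differences."""
    def phi(x):
        return delta(x) / x
    d1 = (phi(a + h) - phi(a - h)) / (2.0 * h)
    d2 = (phi(a + h) - 2.0 * phi(a) + phi(a - h)) / h**2
    return 2.0 * (1.0 + a) * a**2 * d2 + (6.0 + 7.0 * a) * a * d1 + 2.0 * phi(a)

if __name__ == "__main__":
    print(meszaros_residual(delta_grow, 1.0), meszaros_residual(delta_decay, 1.0))
    a = 1e-4
    print(delta_decay(a), -math.log(a / 4.0) - 3.0)   # small-a form (23.83)
    a = 100.0
    print(delta_decay(a), (4.0 / 15.0) * a**-1.5)     # large-a form (4/15) a^(-3/2)
```

The constant prefactor in equation (23.80) drops out of the linear equation (23.79), which is why the check uses δc/a directly.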


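Matching to the solution for δc well inside the horizon, equation (23.71), determines the constants

\begin{equation}
C_{\rm grow} = - 9 \Phi(0) \left[ \gamma - \frac{7}{2} + \ln \left( 4 \sqrt{\frac{2}{3}} \, \frac{k}{a_{\rm eq} H_{\rm eq}} \right) \right] , \quad
C_{\rm decay} = 9 \Phi(0) \quad \mbox{adiabatic} .
\tag{23.84}
\end{equation}

For isocurvature initial conditions, for modes that enter the horizon well before matter-radiation equality, only the growing mode is present,

\begin{equation}
C_{\rm grow} = \delta_c(0) , \quad
C_{\rm decay} = 0 \quad \mbox{isocurvature} .
\tag{23.85}
\end{equation}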
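The adiabatic matching can be checked in a few lines: with the constants (23.84), the a ≪ 1 Meszaros combination Cgrow + Cdecay [−ln(a/4) − 3] should reproduce the logarithmic solution (23.71) evaluated at α = √(2/3) k a. The sketch assumes units aeq = Heq = 1; the function names and the sample values are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def matching_constants(phi0, k):
    """C_grow and C_decay of equation (23.84), in units a_eq = H_eq = 1."""
    log_term = math.log(4.0 * math.sqrt(2.0 / 3.0) * k)
    c_grow = -9.0 * phi0 * (EULER_GAMMA - 3.5 + log_term)
    c_decay = 9.0 * phi0
    return c_grow, c_decay

def delta_c_log(phi0, k, a):
    """Logarithmic solution (23.71) with alpha = sqrt(2/3) k a."""
    alpha = math.sqrt(2.0 / 3.0) * k * a
    return -9.0 * phi0 * (EULER_GAMMA - 0.5 + math.log(alpha))

def delta_c_meszaros_early(phi0, k, a):
    """C_grow * 1 + C_decay * (-ln(a/4) - 3), the a << 1 limit (23.83)."""
    c_grow, c_decay = matching_constants(phi0, k)
    return c_grow + c_decay * (-math.log(a / 4.0) - 3.0)

if __name__ == "__main__":
    phi0, k, a = -1.0, 50.0, 0.01
    print(delta_c_log(phi0, k, a), delta_c_meszaros_early(phi0, k, a))
```

The two expressions are algebraically identical, so the printed values agree to rounding error; the ln 4 and −3 in the decaying mode are what shift γ − 1/2 to γ − 7/2 + ln 4 in Cgrow.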

The dark matter overdensity δc then evolves as the linear combination (23.81) of growing and decaying modes (23.82). For modes that enter the horizon well before matter-radiation equality, the constants are set by equation (23.84) for adiabatic initial conditions, or equation (23.85) for isocurvature initial conditions. The solution remains valid from the radiation-dominated epoch through into the matter-dominated epoch. At late times well into the matter-dominated epoch, a ≫ 1, the growing mode of the Meszaros solution dominates,

\begin{equation}
\delta_{c,\rm grow} = \frac{3}{2} a , \quad
\delta_{c,\rm decay} = \frac{4}{15} a^{-3/2} \quad
\mbox{for } a \gg 1 ,
\tag{23.86}
\end{equation}

so that the dark matter overdensity δc at late times is

\begin{equation}
\delta_c = \frac{3}{2} C_{\rm grow} \, a \quad \mbox{for } a \gg 1 .
\tag{23.87}
\end{equation}

The potential Φ, equation (23.80), at late times is constant,

\begin{equation}
\Phi = - \frac{9 \, C_{\rm grow}}{8 k^2} \quad \mbox{for } a \gg 1 .
\tag{23.88}
\end{equation}

For modes that enter the horizon well before matter-radiation equality, the radiation monopole Θ0 at late times a ≫ 1 is, with $\alpha \equiv k\eta/\sqrt{3}$,

\begin{align}
\Theta_0 &= - \Phi + \frac{3}{2} \Phi(0) \cos\alpha \quad \mbox{adiabatic} , \tag{23.89a} \\
\Theta_0 &= - \Phi - \frac{\delta_c(0)}{2 \sqrt{2/3} \, k} \sin\alpha \quad \mbox{isocurvature} . \tag{23.89b}
\end{align}

23.13 Matter-dominated

After matter-radiation equality, but before curvature or dark energy become important, non-relativistic matter dominates the mass-energy density of the Universe. In the matter-dominated epoch, the relevant equations are, from equations (23.26), (23.28a), and (23.30),

\begin{align}
\dot\delta_c - k v_c &= 3 \dot\Phi , \tag{23.90a} \\
- k^2 \Phi - 3 \frac{\dot a}{a} F &= 4\pi G a^2 \bar\rho_c \delta_c , \tag{23.90b} \\
- k F &= 4\pi G a^2 \bar\rho_c v_c , \tag{23.90c}
\end{align}


in which, because it simplifies the mathematics, the Einstein momentum equation is used as a substitute for the matter velocity equation. In the matter-dominated epoch, the horizon is proportional to the square root of the cosmic scale factor, $\eta \propto a^{1/2}$, equation (23.38). Inserting δc and vc from the Einstein energy and momentum equations (23.90b) and (23.90c) into the matter density equation (23.90a) yields a second order differential equation for the potential Φ

\begin{equation}
\ddot\Phi + \frac{6}{\eta} \dot\Phi = 0 .
\tag{23.91}
\end{equation}

The general solution of equation (23.91) is a linear combination

\begin{equation}
\Phi = C_{\rm grow} \Phi_{\rm grow} + C_{\rm decay} \Phi_{\rm decay}
\tag{23.92}
\end{equation}

of growing and decaying solutions

\begin{equation}
\Phi_{\rm grow} = 1 , \quad
\Phi_{\rm decay} = \alpha^{-5} ,
\tag{23.93}
\end{equation}

where the dimensionless parameter α is the wavevector k multiplied by the sound horizon $\eta/\sqrt{3}$,

\begin{equation}
\alpha \equiv \frac{k\eta}{\sqrt{3}} = 2 \sqrt{\frac{2}{3}} \, \frac{k a^{1/2}}{a_{\rm eq}^{3/2} H_{\rm eq}} .
\tag{23.94}
\end{equation}

The constants Cgrow and Cdecay in the solution (23.92) depend on conditions established before the matter-dominated epoch. The corresponding growing and decaying modes for the dark matter overdensity δc are

\begin{equation}
\delta_{c,\rm grow} = - \left( 2 + \frac{\alpha^2}{2} \right) \Phi_{\rm grow} , \quad
\delta_{c,\rm decay} = \left( 3 - \frac{\alpha^2}{2} \right) \Phi_{\rm decay} .
\tag{23.95}
\end{equation}

For modes well inside the horizon, α ≫ 1, the behaviour of the growing and decaying modes agrees with that (23.86) of the Meszaros solution, as it should. Any admixture of the decaying solution tends quickly to decay away, leaving the growing solution. The solution (23.51) for the radiation monopole Θ0 driven by the potential (23.92) is a sum of a homogeneous solution and a particular solution,

\begin{equation}
\Theta_0 = B_0 \cos\alpha + B_1 \sin\alpha + C_{\rm grow} \Theta_{0,\rm grow} + C_{\rm decay} \Theta_{0,\rm decay} ,
\tag{23.96}
\end{equation}

with growing and decaying modes

\begin{equation}
\Theta_{0,\rm grow} = - \Phi_{\rm grow} , \quad
\Theta_{0,\rm decay} = \frac{\Phi_{\rm decay}}{12} \left\{ 12 - 2\alpha^2 + \alpha^4 + \alpha^5 \left[ \cos\alpha \, ({\rm Si}\,\alpha - \pi/2) - \sin\alpha \, {\rm Ci}\,\alpha \right] \right\} .
\tag{23.97}
\end{equation}

23.14 Recombination

Exercise 23.6 Electron-scattering mean free path.

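1. Define the neutron fraction Xn by

\begin{equation}
X_n \equiv \frac{n_n}{n_p + n_n} ,
\tag{23.98}
\end{equation}

where the proton and neutron number densities np and nn are taken to include protons and neutrons in all nuclei. For a H plus $^4$He composition, the proton and neutron number densities are

\begin{equation}
n_p = n_{\rm H} + 2 n_{^4{\rm He}} , \quad
n_n = 2 n_{^4{\rm He}} .
\tag{23.99}
\end{equation}

Show that the primordial $^4$He mass fraction defined by $Y_{^4{\rm He}} \equiv \rho_{^4{\rm He}} / (\rho_{\rm H} + \rho_{^4{\rm He}})$ satisfies

\begin{equation}
Y_{^4{\rm He}} = 2 X_n .
\tag{23.100}
\end{equation}

The observed primordial $^4$He abundance is $Y_{^4{\rm He}} = 0.24$, implying

\begin{equation}
X_n = 0.12 .
\tag{23.101}
\end{equation}

2. Define the ionization fraction Xe by

\begin{equation}
X_e \equiv \frac{n_e}{n_p} ,
\tag{23.102}
\end{equation}

where again the proton number density np is taken to include protons in all nuclei, not just in hydrogen. Show that

\begin{equation}
n_e = X_e (1 - X_n) \frac{\rho_b}{m_p} ,
\tag{23.103}
\end{equation}

where mp is the mass of a proton or neutron.

3. Show that the (dimensionless) ratio of the comoving electron-photon scattering mean free path lT to the comoving cosmological horizon distance c/(aeq Heq) at matter-radiation equality is

\begin{align}
\frac{a_{\rm eq} H_{\rm eq} l_T}{c}
&\equiv \frac{a_{\rm eq} H_{\rm eq}}{c \, \bar n_e \sigma_T a}
= \frac{16\pi G m_p}{3 c \sigma_T X_e (1 - X_n) H_{\rm eq}} \, \frac{\Omega_m}{\Omega_b} \left( \frac{a}{a_{\rm eq}} \right)^2 \nonumber \\
&= \frac{28.94 \, h^{-1}}{X_e (1 - X_n)} \, \frac{H_0}{H_{\rm eq}} \, \frac{\Omega_m}{\Omega_b} \left( \frac{a}{a_{\rm eq}} \right)^2
= \frac{0.002}{X_e} \left( \frac{a}{a_{\rm eq}} \right)^2 ,
\tag{23.104}
\end{align}

the Hubble parameter Heq at matter-radiation equality being related to the present-day Hubble parameter H0 by equation (23.41).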

23.15 Post-recombination

Recombination frees baryons and photons from each other’s grasp.

Exercise 23.7 Growth of baryon fluctuations after recombination.

23.16 Matter with dark energy

Some time after recombination, dark energy becomes important. Observational evidence suggests that the dominant energy-momentum component of the Universe today is dark energy, with an equation of state consistent with that of a cosmological constant, pΛ = −ρΛ. In what follows, dark energy is taken to have constant density, and therefore to be synonymous with a cosmological constant. Since dark energy has a constant energy density whereas matter density declines as $a^{-3}$, dark energy becomes important only well after recombination. Dark energy does not cluster gravitationally, so the Einstein equations for the perturbed energy-momentum depend only on the matter fluctuation. However, dark energy does affect the evolution of the cosmic scale factor a. In fact, if matter is taken to be the only source of perturbation, then covariant energy-momentum conservation, as enforced by the Einstein equations, implies that the only addition that can be made to the unperturbed background is dark energy, with constant energy density. To see this, consider the equations governing the matter overdensity δm and scalar velocity vm (now subscripted m, since post-recombination matter includes baryons as well as non-baryonic cold dark matter), together with the Einstein energy and momentum equations:

\begin{align}
\dot\delta_m - k v_m &= 3 \dot\Phi , \tag{23.105a} \\
\dot v_m + \frac{\dot a}{a} v_m &= - k \Phi , \tag{23.105b} \\
- k^2 \Phi - 3 \frac{\dot a}{a} F &= 4\pi G a^2 \bar\rho_m \delta_m , \tag{23.105c} \\
- k F &= 4\pi G a^2 \bar\rho_m v_m . \tag{23.105d}
\end{align}

The factor $4\pi G a^2 \bar\rho_m$ on the right hand side of the two Einstein equations can be written

\begin{equation}
4\pi G a^2 \bar\rho_m = \frac{3 a_0^3 H_0^2 \Omega_m}{2 a} ,
\tag{23.106}
\end{equation}

where a0 and H0 are the present-day cosmic scale factor and Hubble parameter, and Ωm is the present-day matter density (a constant). Allow the Hubble parameter $H(a) \equiv \dot a / a^2$ to be an arbitrary function of cosmic scale factor a. Inserting δm and velocity vm from the Einstein energy and momentum equations (23.105c) and (23.105d) into the matter equations (23.105a) and (23.105b), and taking the overdensity equation (23.105a) minus $3 \dot a / a$ times the velocity equation (23.105b), yields the condition

\begin{equation}
a^4 \frac{dH^2}{da} + 3 a_0^3 H_0^2 \Omega_m = 0 ,
\tag{23.107}
\end{equation}

whose solution is

\begin{equation}
\frac{H^2}{H_0^2} = \frac{\Omega_m}{(a/a_0)^3} + \Omega_\Lambda
\tag{23.108}
\end{equation}

for some constant ΩΛ. This shows that, as claimed, if only matter perturbations are present, then the unperturbed background can contain, besides matter, only dark energy with constant density $\bar\rho_\Lambda = H_0^2 \Omega_\Lambda / (\tfrac{8}{3} \pi G)$.


The result is a consequence of the fact that the Einstein equations enforce covariant conservation of energy-momentum. With the Hubble parameter given by equation (23.108), the matter and Einstein equations (23.105) yield a second order differential equation for the potential Φ, in units a0 = 1:

\begin{equation}
2 a (\Omega_m + a^3 \Omega_\Lambda) \Phi'' + (7 \Omega_m + 10 a^3 \Omega_\Lambda) \Phi' + 6 a^2 \Omega_\Lambda \Phi = 0 .
\tag{23.109}
\end{equation}

The growing and decaying solutions to equation (23.109) are, in units a0 = 1,

\begin{equation}
\Phi_{\rm grow} = \frac{5 \Omega_m H_0^2}{2} \, \frac{H(a)}{a} \int_0^a \frac{da'}{a'^3 H(a')^3} , \quad
\Phi_{\rm decay} = \frac{H(a)}{a} .
\tag{23.110}
\end{equation}

The factor $\tfrac{5}{2} \Omega_m H_0^2$ in the growing solution is chosen so that Φgrow → 1 as a → 0. The growing solution Φgrow can be expressed as an elliptic integral. The corresponding growing and decaying solutions for the matter overdensity δm are, again in units a0 = 1,

\begin{equation}
\delta_{m,\rm grow} = \left( 3 - \frac{2 k^2 a}{3 \Omega_m H_0^2} \right) \Phi_{\rm grow} - 5 , \quad
\delta_{m,\rm decay} = \left( 3 - \frac{2 k^2 a}{3 \Omega_m H_0^2} \right) \Phi_{\rm decay} .
\tag{23.111}
\end{equation}

For modes well inside the horizon, $k\eta \sim k a^{1/2} / H_0 \gg 1$, the relation (23.111) agrees with that (23.117) below.

23.17 Matter with dark energy and curvature

Curvature may also play a role after recombination. Observational evidence as of 2010 is consistent with the Universe having zero curvature, but it is possible that there may be some small curvature. If the curvature is non-zero, then strictly the unperturbed metric should be an FRW metric with curvature. However, a flat background FRW metric remains a good approximation for modes whose scales are small compared to the curvature, that is, for modes that are well inside the horizon, kη ≫ 1. For modes well inside the horizon, the time derivative of the potential can be neglected, $\dot\Phi = 0$. With matter, curvature, and dark energy present, and for modes well inside the horizon, the equations go over to the Newtonian limit:

\begin{align}
\dot\delta_m - k v_m &= 0 , \tag{23.112a} \\
\dot v_m + \frac{\dot a}{a} v_m &= - k \Phi , \tag{23.112b} \\
- k^2 \Phi &= 4\pi G a^2 \bar\rho_m \delta_m . \tag{23.112c}
\end{align}

The factor $4\pi G a^2 \bar\rho_m$ in the Einstein equation can be written as equation (23.106). The matter and Einstein equations (23.112) yield a second order equation for the matter overdensity δm, in units a0 = 1:

\begin{equation}
\ddot\delta_m + \frac{\dot a}{a} \dot\delta_m - \frac{3 \Omega_m H_0^2}{2 a} \delta_m = 0 .
\tag{23.113}
\end{equation}

Equation (23.113) can be recast as a differential equation with respect to cosmic scale factor a:

\begin{equation}
\delta_m'' + \left( \frac{H'}{H} + \frac{3}{a} \right) \delta_m' - \frac{3 \Omega_m H_0^2}{2 a^5 H^2} \delta_m = 0 ,
\tag{23.114}
\end{equation}

where $H \equiv \dot a / a^2$ is the Hubble parameter, and prime ′ denotes differentiation with respect to a. In the case of matter plus curvature plus dark energy, the Hubble parameter H satisfies, again in units a0 = 1,

\begin{equation}
\frac{H^2}{H_0^2} = \Omega_m a^{-3} + \Omega_k a^{-2} + \Omega_\Lambda ,
\tag{23.115}
\end{equation}

where Ωm, Ωk, and ΩΛ are the (constant) present-day values of the matter, curvature, and dark energy densities. The growing and decaying solutions to equation (23.114) are

\begin{equation}
\delta_{m,\rm grow} \equiv a \, g(a) = \frac{5 \Omega_m H_0^2}{2} \, H(a) \int_0^a \frac{da'}{a'^3 H(a')^3} , \quad
\delta_{m,\rm decay} = \frac{H}{H_0} .
\tag{23.116}
\end{equation}

The potential Φ is related to the matter overdensity δm by, again in units a0 = 1, equation (23.112c),

\begin{equation}
\Phi = - \frac{3 \Omega_m H_0^2}{2 k^2 a} \delta_m .
\tag{23.117}
\end{equation}

The observationally relevant solution is the growing mode. The growing mode is conventionally given a special notation, the growth factor g(a), because of its importance to relating the amplitude of clustering at various times, from recombination up to the present. For the growing mode,

\begin{equation}
\delta \propto a \, g(a) , \quad
\Phi \propto g(a) .
\tag{23.118}
\end{equation}

The normalization factor $\tfrac{5}{2} \Omega_m H_0^2$ in equation (23.116) is chosen so that in the matter-dominated phase shortly after recombination (small a), the growth factor g(a) is

\begin{equation}
g(a) = 1 .
\tag{23.119}
\end{equation}

Thus as long as the Universe remains matter-dominated, the potential Φ remains constant. Curvature or dark energy causes the potential Φ to decrease. It should be emphasized that the growing and decaying solutions (23.116) are valid only for the case of matter plus curvature plus constant density dark energy, where the Hubble parameter takes the form (23.115). If another kind of mass-energy is considered, such as dark energy with non-constant density, then equations governing perturbations of the other kind must be adjoined, and the Einstein equations modified accordingly. The growth factor g(a) may be expressed analytically as an elliptic function. A good analytic approximation is (Carroll, Press & Turner 1992, Ann. Rev. Astron. Astrophys. 30, 449)

\begin{equation}
g \approx \frac{5 \Omega_m}{2 \left[ \Omega_m^{4/7} - \Omega_\Lambda + \left( 1 + \tfrac{1}{2} \Omega_m \right) \left( 1 + \tfrac{1}{70} \Omega_\Lambda \right) \right]} ,
\tag{23.120}
\end{equation}

where the $\Omega_a$ are densities at the epoch being considered (such as the present, a = a0).
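The growth factor can be computed by direct quadrature of equation (23.116) and compared with the approximation (23.120). The sketch below works in units a0 = H0 = 1 and uses a simple midpoint rule; the function names, the number of steps, and the sample cosmology Ωm = 0.3, ΩΛ = 0.7 are illustrative assumptions.

```python
import math

def hubble_ratio(a, omega_m, omega_l):
    """H(a)/H0 from equation (23.115), with Omega_k = 1 - Omega_m - Omega_L."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m / a**3 + omega_k / a**2 + omega_l)

def growth_factor(a, omega_m, omega_l, n=2000):
    """g(a) = delta_grow / a from equation (23.116), by midpoint quadrature."""
    h = a / n
    integral = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        integral += h / (x * hubble_ratio(x, omega_m, omega_l))**3
    return 2.5 * omega_m * hubble_ratio(a, omega_m, omega_l) * integral / a

def growth_factor_cpt(omega_m, omega_l):
    """Approximation (23.120); arguments are densities at the epoch considered."""
    denominator = (omega_m**(4.0 / 7.0) - omega_l
                   + (1.0 + 0.5 * omega_m) * (1.0 + omega_l / 70.0))
    return 2.5 * omega_m / denominator

if __name__ == "__main__":
    print(growth_factor(1.0, 1.0, 0.0), growth_factor_cpt(1.0, 0.0))  # matter only
    print(growth_factor(1.0, 0.3, 0.7), growth_factor_cpt(0.3, 0.7))  # with dark energy
```

For a matter-only universe both functions return g = 1, consistent with equation (23.119). At a = a0 the present-day densities are also the densities at the epoch considered, so the two functions can be compared directly there.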

24 ∗

Cosmological perturbations: a more careful treatment of photons and baryons

The “simple” treatment of cosmological perturbations in the previous Chapter is sufficient to reveal that the photon-baryon fluid at the time of recombination shows a characteristic pattern of oscillations. Translating this pattern into something that can be compared to observations of the CMB requires a more careful treatment that follows the evolution of photons using a collisional Boltzmann equation.

Cosmologists conventionally refer to atomic matter (anything that acts by either strong or electromagnetic forces) as baryons (from the Greek baryos, meaning heavy), even though they mean by that not only baryons (protons, neutrons, and other nuclei), but also electrons, which are leptons (from the Greek leptos, meaning light). The designation baryons does not include relativistic species, such as photons and neutrinos, nor does it include non-baryonic dark matter or dark energy. The designation “baryons” is nonsensical, but has stuck.

Although baryons are gravitationally sub-dominant, having a mass density about 1/5 that of the non-baryonic dark matter, they play an important role in CMB fluctuations. Most importantly, before recombination atomic matter is ionized, and the free electrons scatter photons, preventing photons from travelling far. Recombination occurs when the temperature drops to the point that electrons combine into neutral atoms, releasing the photons to travel freely across the Universe, into astronomers’ telescopes. Electron-photon scattering keeps photons and baryons tightly coupled, so that up to the time of recombination they oscillate together as a photon-baryon fluid. As recombination approaches, the baryon density becomes increasingly important relative to the photon density. The baryons, which provide mass but no pressure, decrease the sound speed of the photon-baryon fluid, and their gravity enhances sound wave compressions while weakening rarefactions.

As recombination approaches, the mean free path of photons to scattering increases, which tends to damp sound waves at short scales. All these effects (a decreased sound speed, an enhancement of compression over rarefaction, and damping at small scales) produce observable signatures in the power spectrum of temperature fluctuations in the CMB.


24.1 Lorentz-invariant spatial and momentum volume elements

DO THIS BETTER.

To define an occupation number in a Lorentz-invariant fashion, it is first necessary to define Lorentz-invariant volume elements of space and momentum. With respect to an orthonormal tetrad, volume elements transform as they do in special relativity. With respect to an orthonormal tetrad, a Lorentz-invariant spatial 3-volume element can be constructed as

\begin{equation}
\frac{d^4x}{d\tau} = \frac{dt}{d\tau} \, d^3x = E \, d^3x .
\tag{24.1}
\end{equation}

Lorentz invariance of $E \, d^3x$ is evident because the tetrad-frame 4-volume element $d^4x$ is a scalar, and likewise the interval dτ of proper time is a scalar. Similarly, with respect to an orthonormal tetrad, a Lorentz-invariant momentum-space 3-volume element can be constructed as

\begin{equation}
\int_{E>0} 2 \, \delta_D(E^2 - p^2 - m^2) \, d^4p
= \int_{E>0} 2 \, \delta_D(E^2 - p^2 - m^2) \, dE \, d^3p
= \frac{d^3p}{E} .
\tag{24.2}
\end{equation}

Lorentz invariance of $d^3p / E$ is evident because $d^4p$ is a Lorentz-invariant momentum 4-volume, and the argument $E^2 - p^2 - m^2$ of the delta-function is a scalar. From the Lorentz invariance of $E \, d^3x$ and $d^3p / E$ it follows that the product

\begin{equation}
d^3x \, d^3p
\tag{24.3}
\end{equation}

of spatial and momentum 3-volumes is Lorentz-invariant.
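The invariance of $d^3p/E$ can be illustrated numerically for a boost along one axis. Under a boost along z the transverse momenta are unchanged, so the Jacobian of the momentum transformation is ∂pz′/∂pz at fixed px, py, and invariance requires this to equal E′/E. The sketch below, with illustrative function names and units c = 1, checks this by central differences.

```python
import math

def boost_z(px, py, pz, m, v):
    """Boost an on-shell 4-momentum along z by velocity v (units c = 1)."""
    energy = math.sqrt(m * m + px * px + py * py + pz * pz)
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (energy - v * pz), px, py, gamma * (pz - v * energy)

def jacobian_and_energy_ratio(px, py, pz, m, v, h=1e-5):
    """Return (d pz'/d pz at fixed px, py) and E'/E for comparison."""
    energy = math.sqrt(m * m + px * px + py * py + pz * pz)
    energy_prime = boost_z(px, py, pz, m, v)[0]
    pz_plus = boost_z(px, py, pz + h, m, v)[3]
    pz_minus = boost_z(px, py, pz - h, m, v)[3]
    return (pz_plus - pz_minus) / (2.0 * h), energy_prime / energy

if __name__ == "__main__":
    print(jacobian_and_energy_ratio(0.3, -0.4, 1.2, 1.0, 0.6))  # massive particle
    print(jacobian_and_energy_ratio(0.3, -0.4, 1.2, 0.0, 0.6))  # massless particle
```

Since $d^3p' = (E'/E) \, d^3p$, the ratio $d^3p/E$ is the same in both frames, for massive and massless particles alike.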

24.2 Occupation numbers

WARNING: d3x IS TETRAD-FRAME VOLUME, BUT x IS A COORDINATE.

Each species of energy-momentum is described by a dimensionless occupation number, or phase-space distribution, a function f(η, x, p) of conformal time η, comoving position x, and tetrad-frame momentum p, which describes the number dN of particles in a tetrad-frame element $d^3x \, d^3p / (2\pi\hbar)^3$ of phase-space

\begin{equation}
dN(\eta, \mathbf{x}, \mathbf{p}) = f(\eta, \mathbf{x}, \mathbf{p}) \, \frac{g \, d^3x \, d^3p}{(2\pi\hbar)^3} ,
\tag{24.4}
\end{equation}

with g being the number of spin states of the particle. The tetrad-frame phase-space element $d^3x \, d^3p / (2\pi\hbar)^3$ is dimensionless and Lorentz invariant, and the occupation number f is likewise dimensionless and Lorentz invariant. The tetrad-frame energy-momentum 4-vector $p^m$ of a particle is

\begin{equation}
p^m \equiv e^m{}_\mu \frac{dx^\mu}{d\lambda} = \{ E, \mathbf{p} \} = \{ E, p^i \} ,
\tag{24.5}
\end{equation}

where λ is the affine parameter, related to proper time τ along the worldline of the particle by dλ ≡ dτ/m, which remains well-defined in the limit of massless particles, m = 0. The tetrad-frame energy E and momentum p ≡ |p| for a particle of rest mass m are related by

\begin{equation}
E^2 - p^2 = m^2 .
\tag{24.6}
\end{equation}


The tetrad-frame components of the energy-momentum tensor $T^{mn}$ of any species are integrals over its occupation number f weighted by the product $p^m p^n$ of 4-momenta:

\begin{equation}
T^{mn} = \int f \, p^m p^n \, \frac{g \, d^3p}{E (2\pi\hbar)^3} .
\tag{24.7}
\end{equation}

The energy-momentum tensor T mn defined by equation (24.7) is manifestly a tetrad-frame tensor, thanks to the Lorentz-invariance of the momentum-space 3-volume element d3p/E.

24.3 Occupation numbers in thermodynamic equilibrium

Frequent collisions tend to drive a system towards thermodynamic equilibrium. Electron-photon scattering keeps photons in near equilibrium with electrons, while Coulomb scattering keeps electrons in near equilibrium with ions, primarily hydrogen ions (protons) and helium nuclei. Thus photons and baryons can be treated as having unperturbed distributions in mutual thermodynamic equilibrium, and perturbed distributions that are small departures from thermodynamic equilibrium. In thermodynamic equilibrium at temperature T, the occupation numbers of fermions, which obey an exclusion principle, and of bosons, which obey an anti-exclusion principle, are

\begin{equation}
f = \left\{
\begin{array}{ll}
\dfrac{1}{e^{(E - \mu)/T} + 1} & \mbox{fermion} , \\[2ex]
\dfrac{1}{e^{(E - \mu)/T} - 1} & \mbox{boson} ,
\end{array}
\right.
\tag{24.8}
\end{equation}

where µ is the chemical potential of the species. In the limit of small occupation numbers, f ≪ 1, equivalent to large negative chemical potential, µ → −large, both fermion and boson distributions go over to the Boltzmann distribution

\begin{equation}
f = e^{(- E + \mu)/T} \quad \mbox{Boltzmann} .
\tag{24.9}
\end{equation}
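The approach to the Boltzmann limit is easy to see numerically. The sketch below evaluates the three distributions of equations (24.8) and (24.9) in units where the Boltzmann constant is 1; the function name and the sample values of E, µ and T are illustrative.

```python
import math

def occupation(energy, mu, temperature, kind):
    """Equilibrium occupation number f for the given statistics."""
    x = (energy - mu) / temperature
    if kind == "fermion":
        return 1.0 / (math.exp(x) + 1.0)
    if kind == "boson":
        return 1.0 / math.expm1(x)  # 1 / (e^x - 1), requires x > 0
    if kind == "boltzmann":
        return math.exp(-x)
    raise ValueError("kind must be 'fermion', 'boson' or 'boltzmann'")

if __name__ == "__main__":
    # Large negative chemical potential: all three occupation numbers are
    # small and nearly equal.
    for kind in ("fermion", "boson", "boltzmann"):
        print(kind, occupation(1.0, -10.0, 1.0, kind))
```

The fermion value always lies below the Boltzmann value and the boson value above it, with the fractional differences of order f itself.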

Chemical potential is the thermodynamic potential associated with conservation of number. There is a distinct potential for each conserved species. For example, photoionization and radiative recombination of hydrogen,

\begin{equation}
{\rm H} + \gamma \leftrightarrow p + e ,
\tag{24.10}
\end{equation}

separately preserves proton and electron number, hydrogen being composed of one proton and one electron. In thermodynamic equilibrium, the chemical potential µH of hydrogen is the sum of the chemical potentials µp and µe of protons and electrons

\begin{equation}
\mu_{\rm H} = \mu_p + \mu_e .
\tag{24.11}
\end{equation}

Photon number is not conserved, so photons have zero chemical potential,

\begin{equation}
\mu_\gamma = 0 ,
\tag{24.12}
\end{equation}

which is closely associated with the fact that photons are their own antiparticles.


24.4 Boltzmann equation

The evolution of each species is described by the general relativistic Boltzmann equation

\begin{equation}
\frac{df}{d\lambda} = C[f] ,
\tag{24.13}
\end{equation}

where C[f] is a collision term. The derivative with respect to affine parameter λ on the left hand side of the Boltzmann equation (24.13) is a Lagrangian derivative along the (timelike or lightlike) worldline of a particle in the fluid. Since both the occupation number f and the affine parameter λ are Lorentz scalars, the collision term C[f] is a Lorentz scalar. In the absence of collisions, the collisionless Boltzmann equation df/dλ = 0 expresses conservation of particle number: a particle is neither created nor destroyed as it moves along its worldline. The left hand side of the Boltzmann equation (24.13) is

\begin{equation}
\frac{df}{d\lambda}
= p^m \partial_m f + \frac{dp^i}{d\lambda} \frac{\partial f}{\partial p^i}
= E \partial_0 f + p^i \partial_i f + \frac{dp}{d\lambda} \frac{\partial f}{\partial p} + \frac{d\hat{\mathbf{p}}}{d\lambda} \cdot \frac{\partial f}{\partial \hat{\mathbf{p}}} .
\tag{24.14}
\end{equation}

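Both $d\hat{\mathbf{p}}/d\lambda$ and $\partial f / \partial \hat{\mathbf{p}}$ vanish in the unperturbed background, so $d\hat{\mathbf{p}}/d\lambda \cdot \partial f / \partial \hat{\mathbf{p}}$ is of second order, and can be neglected to linear order, so that

\begin{equation}
\frac{df}{d\lambda} = E \partial_0 f + p^i \partial_i f + \frac{dp}{d\lambda} \frac{\partial f}{\partial p} .
\tag{24.15}
\end{equation}

The expression (24.15) for the left hand side df/dλ of the Boltzmann equation involves dp/dλ, which in free-fall is determined by the usual geodesic equation

\begin{equation}
\frac{dp^k}{d\lambda} + \Gamma^k{}_{mn} \, p^m p^n = 0 .
\tag{24.16}
\end{equation}

Since $E^2 - p^2 = m^2$, it follows that the equation of motion for the magnitude p of the tetrad-frame momentum is related to the equation of motion for the tetrad-frame energy E by

\begin{equation}
p \frac{dp}{d\lambda} = E \frac{dE}{d\lambda} .
\tag{24.17}
\end{equation}

The equation of motion for the tetrad-frame energy $E \equiv p^0$ is

\begin{equation}
\frac{dE}{d\lambda} = - \Gamma^0{}_{mn} \, p^m p^n = \Gamma_{0i0} \, p^i E + \Gamma_{0ij} \, p^i p^j .
\tag{24.18}
\end{equation}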

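From this it follows that

\begin{equation}
\frac{d \ln p}{d\lambda} = \frac{E}{p^2} \frac{dE}{d\lambda}
= E \left( \frac{E}{p} \hat{p}^i \Gamma_{0i0} + \hat{p}^i \hat{p}^j \Gamma_{0ij} \right)
= E \left( - \frac{\dot a}{a^2} + \frac{E}{p} \hat{p}^i \Gamma_{0i0} + \hat{p}^i \hat{p}^j \overset{1}{\Gamma}_{0ij} \right) ,
\tag{24.19}
\end{equation}

where in the last expression the tetrad connection $\Gamma_{0ij}$, equation (22.14b), has been separated into its unperturbed and perturbed parts $- (\dot a / a^2) \delta_{ij}$ and $\overset{1}{\Gamma}_{0ij}$.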

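In practice, the integration variable used to evolve equations is the conformal time η, not the affine parameter λ. The relation between conformal time η and affine parameter λ is

\begin{equation}
\frac{d\eta}{d\lambda} = p^\eta = e_m{}^\eta p^m = (\delta_m^n + \varphi_m{}^n) \, \overset{0}{e}_n{}^\eta \, p^m
= \frac{1}{a} \left[ E (1 - \varphi_{00}) - p^i \varphi_{i0} \right] ,
\tag{24.20}
\end{equation}

whose reciprocal is to linear order

\begin{equation}
\frac{d\lambda}{d\eta} = \frac{a}{E} \left( 1 + \varphi_{00} + \frac{p^i}{E} \varphi_{i0} \right) .
\tag{24.21}
\end{equation}

With conformal time η as the integration variable, the equation of motion (24.19) for the magnitude p of the tetrad-frame momentum becomes, to linear order,

\begin{equation}
\frac{d \ln p}{d\eta} = - \frac{\dot a}{a} \left( 1 + \varphi_{00} + \frac{p^i}{E} \varphi_{i0} \right)
+ \frac{E \hat{p}^i}{p} \, a \, \Gamma_{0i0} + \hat{p}^i \hat{p}^j \, a \, \overset{1}{\Gamma}_{0ij} .
\tag{24.22}
\end{equation}

With respect to conformal time η, the Boltzmann equation (24.13) is

\begin{equation}
\frac{df}{d\eta} = \frac{\partial f}{\partial \eta} + v^i \nabla_i f + \frac{d \ln p}{d\eta} \frac{\partial f}{\partial \ln p} = \frac{d\lambda}{d\eta} \, C[f] ,
\tag{24.23}
\end{equation}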

with dλ/dη from equation (24.21), and d ln p/dη from equation (24.22). Expressions for dλ/dη and d ln p/dη in terms of the vierbein perturbations in a general gauge are left as Exercise 24.1. In conformal Newtonian gauge, the factor dλ/dη, equation (24.21), is

\begin{equation}
\frac{d\lambda}{d\eta} = \frac{a}{E} (1 + \Psi) .
\tag{24.24}
\end{equation}

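In conformal Newtonian gauge, and including only scalar fluctuations, the factor d ln p/dη, equation (24.22), is

\begin{equation}
\frac{d \ln p}{d\eta} = - \frac{\dot a}{a} + \dot\Phi - \frac{E \hat{p}^i}{p} \nabla_i \Psi .
\tag{24.25}
\end{equation}

To unperturbed order, the Boltzmann equation (24.23) is

\begin{equation}
\frac{d \overset{0}{f}}{d\eta} = \frac{\partial \overset{0}{f}}{\partial \eta} - \frac{\dot a}{a} \frac{\partial \overset{0}{f}}{\partial \ln p} = \frac{a}{E} \, \overset{0}{C}[f] ,
\tag{24.26}
\end{equation}

where $\overset{0}{C}[f]$ is the unperturbed collision term, the factor a/E coming from dλ/dη = a/E to unperturbed order, equation (24.21). The second term in the middle expression of equation (24.26) simply reflects the fact that the tetrad-frame momentum p redshifts as p ∝ 1/a as the Universe expands, a statement that is true for both massive and massless particles. Subtracting off the unperturbed part (24.26) of the Boltzmann equation (24.23) gives the perturbation of the Boltzmann equation

\begin{equation}
\frac{d \overset{1}{f}}{d\eta} = \frac{\partial \overset{1}{f}}{\partial \eta} + v^i \nabla_i \overset{1}{f} - \frac{\dot a}{a} \frac{\partial \overset{1}{f}}{\partial \ln p} + \frac{d \ln(ap)}{d\eta} \frac{\partial \overset{0}{f}}{\partial \ln p}
= \frac{a}{E} \, \overset{1}{C}[f] + \overset{1}{\frac{d\lambda}{d\eta}} \, \overset{0}{C}[f] .
\tag{24.27}
\end{equation}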


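In conformal Newtonian gauge, the perturbed part of dλ/dη is

\begin{equation}
\overset{1}{\frac{d\lambda}{d\eta}} = \frac{a}{E} \Psi .
\tag{24.28}
\end{equation}

In conformal Newtonian gauge, and including only scalar fluctuations, d ln(ap)/dη is

\begin{equation}
\frac{d \ln(ap)}{d\eta} = \dot\Phi - \frac{E \hat{p}^i}{p} \nabla_i \Psi .
\tag{24.29}
\end{equation}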

Exercise 24.1 Boltzmann equation factors in a general gauge. Show that in a general gauge, and including not just scalar but also vector and tensor fluctuations, equation (24.21) is   dλ pi a 1 + ψ + (∇i w = ˜+w ˜i ) , (24.30) dη E E while equation (24.22) is, with only scalar fluctuations included,     ∂ a˙ E pˆi a˙ m2 d ln p ˙ − ∇i ψ + (∇i w ˜+w ˜i ) =− +φ+ + dη a p ∂η a E 2 i h ˙ ˙ − 1 (∇i Wj + ∇j Wi ) + ∇j w . ˜ + h + pˆi pˆj − ∇i ∇j (w − h) i ij 2

(24.31)

24.5 Non-baryonic cold dark matter

Non-baryonic cold dark matter, subscripted c, is by assumption non-relativistic and collisionless. The unperturbed mean density is $\bar\rho_c$, which evolves with cosmic scale factor a as

\begin{equation}
\bar\rho_c \propto a^{-3} .
\tag{24.32}
\end{equation}

Since dark matter particles are non-relativistic, the energy of a dark matter particle is its rest-mass energy, $E_c = m_c$, and its momentum is the non-relativistic momentum $p_c^i = m_c v_c^i$. The energy-momentum tensor $T_c^{mn}$ of the dark matter is obtained from integrals over the dark matter phase-space distribution fc, equation (24.7). The energy and momentum moments of the distribution define the dark matter overdensity δc and bulk velocity $\mathbf{v}_c$, while the pressure is of order $v_c^2$, and can be neglected to linear order,

\begin{align}
T_c^{00} &\equiv \int f_c \, m_c \, \frac{g_c \, d^3p_c}{(2\pi\hbar)^3} \equiv \bar\rho_c (1 + \delta_c) , \tag{24.33a} \\
T_c^{0i} &\equiv \int f_c \, m_c v_c^i \, \frac{g_c \, d^3p_c}{(2\pi\hbar)^3} \equiv \bar\rho_c v_c^i , \tag{24.33b} \\
T_c^{ij} &\equiv \int f_c \, m_c v_c^i v_c^j \, \frac{g_c \, d^3p_c}{(2\pi\hbar)^3} = 0 . \tag{24.33c}
\end{align}


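Non-baryonic cold dark matter is collisionless, so the collision term in the Boltzmann equation is zero, $C[f_c] = 0$, and the dark matter satisfies the collisionless Boltzmann equation

$$\frac{df_c}{d\eta} = 0 . \tag{24.34}$$

The energy and momentum moments of the Boltzmann equation (24.23) yield equations for the overdensity $\delta_c$ and bulk velocity $\mathbf v_c$, which in the conformal Newtonian gauge are

$$\begin{aligned}
0 = \int \frac{df_c}{d\eta}\, m_c\, \frac{g_c\, d^3p_c}{(2\pi\hbar)^3}
&= \frac{\partial}{\partial\eta} \int f_c\, m_c\, \frac{g_c\, d^3p_c}{(2\pi\hbar)^3}
+ \nabla_i \int f_c\, m_c v_c^i\, \frac{g_c\, d^3p_c}{(2\pi\hbar)^3}
- \int \left( \frac{\dot a}{a} - \dot\Phi \right) \frac{\partial f_c}{\partial\ln p}\, m_c\, \frac{g_c\, d^3p_c}{(2\pi\hbar)^3} \\
&= \frac{\partial\, \bar\rho_c (1 + \delta_c)}{\partial\eta} + \nabla_i (\bar\rho_c v_c^i) + 3 \left( \frac{\dot a}{a} - \dot\Phi \right) \bar\rho_c ,
\end{aligned} \tag{24.35a}$$

$$\begin{aligned}
0 = \int \frac{df_c}{d\eta}\, m_c v_c^i\, \frac{g_c\, d^3p_c}{(2\pi\hbar)^3}
&= \frac{\partial}{\partial\eta} \int f_c\, m_c v_c^i\, \frac{g_c\, d^3p_c}{(2\pi\hbar)^3}
+ \nabla_j \int f_c\, m_c v_c^i v_c^j\, \frac{g_c\, d^3p_c}{(2\pi\hbar)^3} \\
&\quad - \int \left( \frac{\dot a}{a} - \dot\Phi + \frac{E}{p}\, \hat p^j \nabla_j \Psi \right) \frac{\partial f_c}{\partial\ln p}\, m_c v_c^i\, \frac{g_c\, d^3p_c}{(2\pi\hbar)^3} \\
&= \frac{\partial\, \bar\rho_c v_c^i}{\partial\eta} + 4 \left( \frac{\dot a}{a} - \dot\Phi \right) \bar\rho_c v_c^i + \bar\rho_c \nabla_i \Psi .
\end{aligned} \tag{24.35b}$$

The $\dot\Phi \bar\rho_c v_c^i$ term on the last line of equation (24.35b) can be dropped, since the potential $\Phi$ and the velocity $v_c^i$ are both of first order, so their product is of second order. Subtracting the unperturbed part from equations (24.35a) and (24.35b) gives equations for the dark matter overdensity $\delta_c$ and velocity $\mathbf v_c$,

$$\dot\delta_c + \nabla \cdot \mathbf v_c - 3 \dot\Phi = 0 , \tag{24.36a}$$

$$\dot{\mathbf v}_c + \frac{\dot a}{a}\, \mathbf v_c + \nabla \Psi = 0 . \tag{24.36b}$$

Transform into Fourier space, and decompose the velocity 3-vector $\mathbf v_c$ into scalar $v_c$ and vector $\mathbf v_{c,\perp}$ parts

$$\mathbf v_c = - i \hat{\mathbf k}\, v_c + \mathbf v_{c,\perp} . \tag{24.37}$$

For the scalar modes under consideration, only the scalar part of the dark matter equations (24.36) is relevant:

$$\dot\delta_c - k v_c - 3 \dot\Phi = 0 , \tag{24.38a}$$

$$\dot v_c + \frac{\dot a}{a}\, v_c + k \Psi = 0 . \tag{24.38b}$$

Equations (24.38) reproduce the equations (23.26) derived previously from conservation of energy and momentum.

Exercise 24.2 Moments of the non-baryonic cold dark matter Boltzmann equation. Confirm equations (24.35).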
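As an illustration of how equations (24.38) behave, the following Python sketch integrates them in a toy background. It is not part of the derivation: the matter-dominated background ($a \propto \eta^2$, so $\dot a/a = 2/\eta$), the constant potentials $\Phi = \Psi$, and the function names `cdm_rhs` and `integrate_cdm` are all assumptions made for the example. Under those assumptions the growing mode is $v_c = -k\Psi\eta/3$, $\delta_c = -k^2\Psi\eta^2/6$, which the integration should reproduce.

```python
# Toy integration of the cold dark matter equations (24.38).
# Assumptions (illustrative, not from the text): a matter-dominated
# background with a ∝ η², so that (da/dη)/a = 2/η, and constant
# potentials Φ = Ψ, so that dΦ/dη = 0.

def cdm_rhs(eta, delta, v, k, psi, phi_dot=0.0):
    """Equations (24.38a,b) solved for dδ_c/dη and dv_c/dη."""
    adot_over_a = 2.0 / eta              # assumed matter-dominated background
    d_delta = k * v + 3.0 * phi_dot      # (24.38a)
    d_v = -adot_over_a * v - k * psi     # (24.38b)
    return d_delta, d_v


def integrate_cdm(k, psi, eta0, eta1, steps=2000):
    """Classical fourth-order Runge-Kutta integration from eta0 to eta1."""
    h = (eta1 - eta0) / steps
    eta = eta0
    # Start on the growing mode v = -kΨη/3, δ = -k²Ψη²/6.
    v = -k * psi * eta / 3.0
    delta = -k * k * psi * eta * eta / 6.0
    for _ in range(steps):
        a1, b1 = cdm_rhs(eta, delta, v, k, psi)
        a2, b2 = cdm_rhs(eta + h / 2, delta + h / 2 * a1, v + h / 2 * b1, k, psi)
        a3, b3 = cdm_rhs(eta + h / 2, delta + h / 2 * a2, v + h / 2 * b2, k, psi)
        a4, b4 = cdm_rhs(eta + h, delta + h * a3, v + h * b3, k, psi)
        delta += h / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        v += h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        eta += h
    return delta, v
```

Calling `integrate_cdm(0.1, -1e-5, 1.0, 10.0)` evolves one Fourier mode from $\eta = 1$ to $\eta = 10$ in arbitrary units; the returned pair can be compared against the analytic growing mode quoted above.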


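24.6 The left hand side of the Boltzmann equation for photons

In the unperturbed background, the photons have a blackbody distribution with temperature $T(\eta)$. Define $\Theta$ to be the photon temperature fluctuation

$$\Theta(\eta, \mathbf x, \mathbf p_\gamma) \equiv \frac{\delta T(\eta, \mathbf x, \mathbf p_\gamma)}{T(\eta)} . \tag{24.39}$$

In the unperturbed background, the photon occupation number is

$$\overset{0}{f}_\gamma = \frac{1}{e^{p_\gamma/T} - 1} . \tag{24.40}$$

Since $p_\gamma \propto T \propto 1/a$, the unperturbed occupation number is constant as a function of $p_\gamma/T$. The definition $\Theta \equiv \delta T/T = \delta \ln T$ of the photon perturbation is to be interpreted as meaning that the perturbation to the occupation number of photons is (the partial derivative with respect to temperature $\partial/\partial\ln T$ is at constant photon momentum $p_\gamma$)

$$\overset{1}{f}_\gamma = \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T}\, \delta\ln T = \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T}\, \Theta , \tag{24.41}$$

in which it follows from equation (24.40) that

$$\frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} = \frac{p_\gamma}{T}\, \overset{0}{f}_\gamma \left( 1 + \overset{0}{f}_\gamma \right) . \tag{24.42}$$

The photon Boltzmann equation in terms of the occupation number $f_\gamma$ can be recast as a Boltzmann equation for the temperature fluctuation $\Theta$ through

$$\frac{d\overset{1}{f}_\gamma}{d\eta} = \frac{d}{d\eta} \left( \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T}\, \Theta \right) = \left( \frac{d}{d\eta} \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \right) \Theta + \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \frac{d\Theta}{d\eta} = \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \frac{d\Theta}{d\eta} , \tag{24.43}$$

in which the first term of the penultimate expression vanishes because $\partial\overset{0}{f}_\gamma/\partial\ln T$ is a function of $p_\gamma/T$ only, and $p_\gamma/T$ is, to unperturbed order (which is all that is needed since the term is multiplied by $\Theta$, which is already of first order), independent of time, $d(p_\gamma/T)/d\eta = 0$, since $p_\gamma \propto T \propto a^{-1}$:

$$\frac{d}{d\eta} \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} = - \frac{d\ln(p_\gamma/T)}{d\eta} \frac{\partial^2 \overset{0}{f}_\gamma}{\partial\ln T^2} = 0 . \tag{24.44}$$

In terms of the temperature fluctuation $\Theta$, the perturbed photon Boltzmann equation (24.27) is

$$\frac{d\Theta}{d\eta} = \frac{\partial\Theta}{\partial\eta} + \hat p_\gamma^i \nabla_i \Theta - \frac{\dot a}{a} \frac{\partial\Theta}{\partial\ln p_\gamma} - \frac{d\ln(ap_\gamma)}{d\eta} = \left( \frac{a}{p_\gamma}\, \overset{1}{C}[f_\gamma] + \overset{1}{\frac{d\lambda}{d\eta}}\, \overset{0}{C}[f_\gamma] \right) \bigg/ \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} , \tag{24.45}$$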

where the $d\ln(ap_\gamma)/d\eta$ term gets a minus sign from $\partial\overset{0}{f}_\gamma/\partial\ln p_\gamma = - \partial\overset{0}{f}_\gamma/\partial\ln T$. The unperturbed photon distribution is in thermodynamic equilibrium, so the unperturbed collision term in the photon Boltzmann equation (24.45) vanishes, $\overset{0}{C}[f_\gamma] = 0$, as found in Exercise 24.3 below. The photon distribution is modified by photon-electron (Thomson) scattering, §24.10. Since the electrons are non-relativistic, to linear order collisions change the photon momentum but not the photon energy. As a consequence, the temperature fluctuation is a function $\Theta(\eta, \mathbf x, \hat{\mathbf p}_\gamma)$ only of the direction $\hat{\mathbf p}_\gamma$ of the photon momentum $\mathbf p_\gamma$, not of its magnitude, the energy $p_\gamma$. This is shown more carefully below, equation (24.78). Hence the derivative $\partial\Theta/\partial\ln p_\gamma$ in the photon Boltzmann equation (24.45) vanishes to linear order. Thus the photon Boltzmann equation (24.45) reduces to

$$\frac{d\Theta}{d\eta} = \frac{\partial\Theta}{\partial\eta} + \hat p_\gamma^i \nabla_i \Theta - \frac{d\ln(ap_\gamma)}{d\eta} = \frac{a}{p_\gamma}\, \overset{1}{C}[f_\gamma] \bigg/ \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} . \tag{24.46}$$

In conformal Newtonian gauge, and including only scalar fluctuations, $d\ln(ap_\gamma)/d\eta$ is given by equation (24.29), so the photon Boltzmann equation (24.45) becomes

$$\frac{d\Theta}{d\eta} = \frac{\partial\Theta}{\partial\eta} + \hat p_\gamma^i \nabla_i \Theta - \dot\Phi + \hat p_\gamma^i \nabla_i \Psi = \frac{a}{p_\gamma}\, \overset{1}{C}[f_\gamma] \bigg/ \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} . \tag{24.47}$$

24.7 Spherical harmonics of the photon distribution

It is natural to expand the temperature fluctuation $\Theta$ in spherical harmonics. As seen below, equations (24.55), the various components of the photon energy-momentum tensor $T_\gamma^{mn}$ are determined by the monopole, dipole, and quadrupole harmonics of the photon distribution. Scalar fluctuations are those that are rotationally symmetric about the wavevector direction $\hat{\mathbf k}$, which correspond to spherical harmonics with zero azimuthal quantum number, $m = 0$. Expanded in spherical harmonics, and with only scalar terms retained, the temperature fluctuation $\Theta$ can be written

$$\Theta(\eta, \mathbf k, \hat{\mathbf p}_\gamma) = \sum_{\ell=0}^{\infty} (-i)^\ell (2\ell + 1)\, \Theta_\ell(\eta, k)\, P_\ell(\hat{\mathbf k} \cdot \hat{\mathbf p}_\gamma) , \tag{24.48}$$

where $P_\ell$ are Legendre polynomials, §24.21. The scalar harmonics $\Theta_\ell$ are angular integrals of the temperature fluctuation $\Theta$ over photon directions $\hat{\mathbf p}_\gamma$:

$$\Theta_\ell(\eta, k) = i^\ell \int \Theta(\eta, \mathbf k, \hat{\mathbf p}_\gamma)\, P_\ell(\hat{\mathbf k} \cdot \hat{\mathbf p}_\gamma)\, \frac{do_{p_\gamma}}{4\pi} . \tag{24.49}$$

Expanded into the scalar harmonics $\Theta_\ell(\eta, k)$, the left hand side of the photon Boltzmann equation (24.46) is, in conformal Newtonian gauge,

$$\frac{d\Theta_0}{d\eta} = \dot\Theta_0 - k \Theta_1 - \dot\Phi , \tag{24.50}$$

$$\frac{d\Theta_1}{d\eta} = \dot\Theta_1 + \frac{k}{3} \left( \Theta_0 - 2 \Theta_2 \right) + \frac{k}{3} \Psi , \tag{24.51}$$

$$\frac{d\Theta_\ell}{d\eta} = \dot\Theta_\ell + \frac{k}{2\ell + 1} \left[ \ell\, \Theta_{\ell-1} - (\ell + 1)\, \Theta_{\ell+1} \right] \quad (\ell \geq 2) . \tag{24.52}$$
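The streaming terms in equations (24.50) to (24.52) can be checked numerically. The Python sketch below is an added illustration, not part of the derivation: it builds $\Theta(\mu)$ from an arbitrary, made-up set of multipoles using the expansion (24.48), multiplies by $-ik\mu$ (the Fourier-space form of $\hat p^i_\gamma \nabla_i$ that appears in equation (24.90)), and projects back onto harmonics with (24.49). The function names and the sample multipoles are assumptions chosen for the example.

```python
def legendre(l, x):
    """Legendre polynomial P_l(x) from the Bonnet recursion."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with n (even) intervals; works for complex values."""
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3


def theta_of_mu(mu, theta_l):
    """Equation (24.48) for scalar modes, with mu = k-hat . p-hat."""
    return sum((-1j) ** l * (2 * l + 1) * t * legendre(l, mu)
               for l, t in enumerate(theta_l))


def project(func, l):
    """Equation (24.49) for an axisymmetric function: i^l ∫ func P_l dμ/2."""
    return (1j) ** l * simpson(lambda mu: func(mu) * legendre(l, mu), -1.0, 1.0) / 2


def streaming_multipole(theta_l, k, l):
    """k/(2l+1) [l Θ_{l-1} - (l+1) Θ_{l+1}], with absent multipoles taken as zero."""
    lower = theta_l[l - 1] if l >= 1 else 0.0
    upper = theta_l[l + 1] if l + 1 < len(theta_l) else 0.0
    return k / (2 * l + 1) * (l * lower - (l + 1) * upper)
```

For example, with `theta_l = [0.3, -0.2, 0.1, 0.05]` and `k = 2.0`, the projection `project(lambda mu: -1j * k * mu * theta_of_mu(mu, theta_l), l)` can be compared with `streaming_multipole(theta_l, k, l)` for each `l`; the case `l = 0` gives the $-k\Theta_1$ of (24.50) and `l = 1` the $\tfrac{k}{3}(\Theta_0 - 2\Theta_2)$ of (24.51).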

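24.8 Energy-momentum tensor for photons

Perturbations to the photon energy-momentum tensor involve integrals (24.7) over the perturbed occupation number of the form, where $F(\hat{\mathbf p})$ is some arbitrary function of the momentum direction $\hat{\mathbf p}$,

$$\int \overset{1}{f}_\gamma\, p_\gamma^2\, F(\hat{\mathbf p}_\gamma)\, \frac{2\, d^3p_\gamma}{p_\gamma (2\pi\hbar)^3} = \int \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T}\, p_\gamma^2\, \frac{2 \cdot 4\pi p_\gamma^2\, dp_\gamma}{p_\gamma (2\pi\hbar)^3} \int \Theta\, F(\hat{\mathbf p}_\gamma)\, \frac{do_{p_\gamma}}{4\pi} = 4 \bar\rho_\gamma \int \Theta\, F(\hat{\mathbf p}_\gamma)\, \frac{do_{p_\gamma}}{4\pi} , \tag{24.53}$$

in which the last expression is true because

$$\int \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T}\, p_\gamma^2\, \frac{2 \cdot 4\pi p_\gamma^2\, dp_\gamma}{p_\gamma (2\pi\hbar)^3} = 4 \int \overset{0}{f}_\gamma\, p_\gamma\, \frac{2 \cdot 4\pi p_\gamma^2\, dp_\gamma}{(2\pi\hbar)^3} = 4 \bar\rho_\gamma , \tag{24.54}$$

which follows from $\partial\overset{0}{f}_\gamma/\partial\ln T = - \partial\overset{0}{f}_\gamma/\partial\ln p_\gamma$ and an integration by parts. The perturbation of the photon energy density, energy flux, monopole pressure, and quadrupole pressure are

$$\overset{1}{T}{}^{00}_\gamma = 4 \bar\rho_\gamma \Theta_0 , \tag{24.55a}$$

$$\hat k_i\, \overset{1}{T}{}^{0i}_\gamma = i\, 4 \bar\rho_\gamma \Theta_1 , \tag{24.55b}$$

$$\tfrac{1}{3}\, \delta_{ij}\, \overset{1}{T}{}^{ij}_\gamma = \tfrac{4}{3}\, \bar\rho_\gamma \Theta_0 , \tag{24.55c}$$

$$\left( \tfrac{3}{2}\, \hat k_i \hat k_j - \tfrac{1}{2}\, \delta_{ij} \right) \overset{1}{T}{}^{ij}_\gamma = - 4 \bar\rho_\gamma \Theta_2 . \tag{24.55d}$$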
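Two of the blackbody identities used above, the derivative (24.42) and the factor of 4 in (24.54), are easy to confirm numerically. The following Python sketch is an added check under stated assumptions: it works in the dimensionless variable $x = p_\gamma/T$, drops the common prefactor $2 \cdot 4\pi T^4/(2\pi\hbar)^3$ from both sides of (24.54), and truncates the momentum integral at $x = 60$. The function names are illustrative only.

```python
import math


def f0(x):
    """Unperturbed photon occupation number (24.40) as a function of x = p/T."""
    return 1.0 / math.expm1(x)


def df0_dlnT(x):
    """Equation (24.42): ∂f0/∂ln T = x f0 (1 + f0)."""
    return x * f0(x) * (1.0 + f0(x))


def df0_dlnT_numeric(p, T, eps=1e-6):
    """Centred finite difference in ln T at fixed photon momentum p."""
    up = f0(p / (T * math.exp(eps)))
    down = f0(p / (T * math.exp(-eps)))
    return (up - down) / (2.0 * eps)


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3


def energy_integrals(x_min=1e-8, x_max=60.0):
    """Both sides of (24.54) with the common prefactor removed."""
    lhs = simpson(lambda x: x ** 3 * df0_dlnT(x), x_min, x_max)
    rhs = 4.0 * simpson(lambda x: x ** 3 * f0(x), x_min, x_max)
    return lhs, rhs
```

The two numbers returned by `energy_integrals()` should agree with each other and with $4\pi^4/15$, the standard value of $4\int_0^\infty x^3 (e^x - 1)^{-1} dx$.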

24.9 Collisions

For a 2-body collision of the form

$$1 + 2 \leftrightarrow 3 + 4 , \tag{24.56}$$

the rate per unit time and volume at which particles of type 1 leave and enter an interval $d^3p_1$ of momentum space is, in units $c = 1$,

$$C[f_1]\, \frac{g_1\, d^3p_1}{E_1 (2\pi\hbar)^3} = \int |\mathcal M|^2 \left[ - f_1 f_2 (1 \mp f_3)(1 \mp f_4) + f_3 f_4 (1 \mp f_1)(1 \mp f_2) \right] (2\pi\hbar)^4\, \delta_D^4(p_1 + p_2 - p_3 - p_4)\, \frac{g_1\, d^3p_1}{2E_1 (2\pi\hbar)^3}\, \frac{g_2\, d^3p_2}{2E_2 (2\pi\hbar)^3}\, \frac{g_3\, d^3p_3}{2E_3 (2\pi\hbar)^3}\, \frac{g_4\, d^3p_4}{2E_4 (2\pi\hbar)^3} . \tag{24.57}$$

All factors in equation (24.57) are Lorentz scalars. On the left hand side, the collision term $C[f_1]$ and the momentum 3-volume element $d^3p_1/E_1$ are both Lorentz scalars. On the right hand side, the squared amplitude $|\mathcal M|^2$, the various occupation numbers $f_a$, the energy-momentum conserving 4-dimensional Dirac delta-function $\delta_D^4(p_1 + p_2 - p_3 - p_4)$, and each of the four momentum 3-volume elements $d^3p_a/E_a$, are all Lorentz scalars.

The first ingredient in the integrand on the right hand side of the expression (24.57) is the Lorentz-invariant scattering amplitude squared $|\mathcal M|^2$, calculated using quantum field theory (F. Halzen & A. D. Martin 1984 Quarks and Leptons, Wiley, New York, p. 91).

The second ingredient in the integrand on the right hand side of expression (24.57) is the combination of rate factors

$$\text{rate}(1 + 2 \rightarrow 3 + 4) \propto f_1 f_2 (1 \mp f_3)(1 \mp f_4) , \tag{24.58a}$$

$$\text{rate}(1 + 2 \leftarrow 3 + 4) \propto f_3 f_4 (1 \mp f_1)(1 \mp f_2) , \tag{24.58b}$$

where the $1 \mp f$ factors are blocking or stimulation factors, the choice of $\mp$ sign depending on whether the species in question is fermionic or bosonic:

$$1 - f = \text{Fermi-Dirac blocking factor} , \tag{24.59a}$$

$$1 + f = \text{Bose-Einstein stimulation factor} . \tag{24.59b}$$

The first rate factor (24.58a) expresses the fact that the rate to lose particles from $1 + 2 \rightarrow 3 + 4$ collisions is proportional to the occupancy $f_1 f_2$ of the initial states, modulated by the blocking/stimulation factors $(1 \mp f_3)(1 \mp f_4)$ of the final states. Likewise the second rate factor (24.58b) expresses the fact that the rate to gain particles from $1 + 2 \leftarrow 3 + 4$ collisions is proportional to the occupancy $f_3 f_4$ of the initial states, modulated by the blocking/stimulation factors $(1 \mp f_1)(1 \mp f_2)$ of the final states. In thermodynamic equilibrium, the rates (24.58) balance, Exercise 24.3, a property that is called detailed balance, or microscopic reversibility. Microscopic reversibility is a consequence of time reversal symmetry.

The final ingredient in the integrand on the right hand side of expression (24.57) is the 4-dimensional Dirac delta-function, which imposes energy-momentum conservation on the process $1 + 2 \leftrightarrow 3 + 4$. The 4-dimensional delta-function is a product of a 1-dimensional delta-function expressing energy conservation, and a 3-dimensional delta-function expressing momentum conservation:

$$(2\pi\hbar)^4\, \delta_D^4(p_1 + p_2 - p_3 - p_4) = 2\pi\hbar\, \delta_D(E_1 + E_2 - E_3 - E_4)\, (2\pi\hbar)^3\, \delta_D^3(\mathbf p_1 + \mathbf p_2 - \mathbf p_3 - \mathbf p_4) . \tag{24.60}$$

Exercise 24.3 Detailed balance. Show that the rates balance in thermodynamic equilibrium,

$$f_1 f_2 (1 \mp f_3)(1 \mp f_4) = f_3 f_4 (1 \mp f_1)(1 \mp f_2) . \tag{24.61}$$

Solution. Equation (24.61) is true if and only if

$$\frac{f_1}{1 \mp f_1}\, \frac{f_2}{1 \mp f_2} = \frac{f_3}{1 \mp f_3}\, \frac{f_4}{1 \mp f_4} . \tag{24.62}$$

But

$$\frac{f}{1 \mp f} = e^{(-E + \mu)/T} , \tag{24.63}$$

so (24.62) is true if and only if

$$\frac{-E_1 + \mu_1}{T} + \frac{-E_2 + \mu_2}{T} = \frac{-E_3 + \mu_3}{T} + \frac{-E_4 + \mu_4}{T} , \tag{24.64}$$

which is true in thermodynamic equilibrium because

$$E_1 + E_2 = E_3 + E_4 , \quad \mu_1 + \mu_2 = \mu_3 + \mu_4 . \tag{24.65}$$
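The solution of Exercise 24.3 can be spot-checked with numbers. The Python sketch below is an added illustration: the energies, chemical potentials, temperature, and the assignment of which particles are fermions are all made-up values, chosen only so that the conditions (24.65) hold. The function names are likewise illustrative.

```python
import math


def occupation(E, mu, T, fermion):
    """Equilibrium occupation number: Fermi-Dirac if fermion, else Bose-Einstein."""
    sign = 1.0 if fermion else -1.0
    return 1.0 / (math.exp((E - mu) / T) + sign)


def blocking(f, fermion):
    """Blocking factor 1 - f for fermions, stimulation factor 1 + f for bosons (24.59)."""
    return 1.0 - f if fermion else 1.0 + f


def forward_and_reverse(E, mu, T, fermion):
    """Rate factors (24.58a) and (24.58b); E, mu, fermion are 4-tuples for particles 1..4."""
    f = [occupation(E[a], mu[a], T, fermion[a]) for a in range(4)]
    b = [blocking(f[a], fermion[a]) for a in range(4)]
    forward = f[0] * f[1] * b[2] * b[3]
    reverse = f[2] * f[3] * b[0] * b[1]
    return forward, reverse


# Hypothetical equilibrium configuration satisfying (24.65):
# E1 + E2 = E3 + E4 = 3.5 and mu1 + mu2 = mu3 + mu4 = -0.3.
EXAMPLE_E = (1.0, 2.5, 0.7, 2.8)
EXAMPLE_MU = (0.2, -0.5, -0.4, 0.1)
EXAMPLE_FERMION = (True, False, True, False)
```

With these inputs, `forward_and_reverse(EXAMPLE_E, EXAMPLE_MU, 1.0, EXAMPLE_FERMION)` returns two numbers that agree to rounding error; changing one energy so that (24.65) fails breaks the equality.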

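24.10 Electron-photon scattering

The dominant process that couples photons and baryons is electron-photon scattering

$$e + \gamma \leftrightarrow e' + \gamma' . \tag{24.66}$$

The Lorentz-invariant transition probability for unpolarized non-relativistic electron-photon (Thomson) scattering is

$$|\mathcal M|^2 = 12\pi m_e^2 \sigma_T \hbar^2 \left[ 1 + (\hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'})^2 \right] = 16\pi m_e^2 \sigma_T \hbar^2 \left[ 1 + \tfrac{1}{2} P_2(\hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'}) \right] , \tag{24.67}$$

where $P_2(\mu)$ is the quadrupole Legendre polynomial, §24.21, and

$$\sigma_T = \frac{8\pi}{3} \left( \frac{e^2}{m_e c^2} \right)^2 \tag{24.68}$$

is the Thomson cross-section.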
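As a quick numerical companion to equations (24.67) and (24.68), the Python sketch below evaluates the Thomson cross-section and confirms that the two angular forms of $|\mathcal M|^2$ agree. It is an added illustration: the value of the classical electron radius $e^2/(m_e c^2)$ in Gaussian units is supplied here as an input and is not quoted in the text, and the function names are illustrative.

```python
import math

# Classical electron radius e²/(m_e c²) in cm (Gaussian units); supplied as an
# input for this sketch, not taken from the text.
R_E_CM = 2.8179403e-13


def thomson_cross_section(r_e=R_E_CM):
    """Equation (24.68): σ_T = (8π/3) (e²/(m_e c²))², in cm² if r_e is in cm."""
    return 8.0 * math.pi / 3.0 * r_e ** 2


def p2(mu):
    """Quadrupole Legendre polynomial P_2(μ)."""
    return 0.5 * (3.0 * mu * mu - 1.0)


def amplitude_dipole_form(mu):
    """|M|² / (π m_e² σ_T ħ²) in the first form of (24.67)."""
    return 12.0 * (1.0 + mu * mu)


def amplitude_legendre_form(mu):
    """|M|² / (π m_e² σ_T ħ²) in the second form of (24.67)."""
    return 16.0 * (1.0 + 0.5 * p2(mu))
```

The two `amplitude_*` functions agree for every scattering cosine `mu` because $1 + \mu^2 = \tfrac{4}{3}\left[1 + \tfrac{1}{2}P_2(\mu)\right]$.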


24.11 The photon collision term for electron-photon scattering

Electron-photon scattering keeps electrons and photons close to mutual thermodynamic equilibrium, and their unperturbed distributions can be taken to be in thermodynamic equilibrium. The unperturbed photon collision term for electron-photon scattering therefore vanishes, because of detailed balance, Exercise 24.3,

$$\overset{0}{C}[f_\gamma] = 0 . \tag{24.69}$$

Thanks to detailed balance, the combination of rates in the collision integral (24.57) almost cancels, so can be treated as being of linear order in perturbation theory. This allows other factors in the collision integral to be approximated by their unperturbed values.

The photon collision term for electron-photon scattering follows from the general expression (24.57). To unperturbed order, the energies of the electrons, which are non-relativistic, may be set equal to their rest masses, $E_e = m_e$. Since photons are massless, their energies are just equal to their momenta, $E_\gamma = p_\gamma$. The electron occupation number is small, $f_e \ll 1$, so the Fermi blocking factors for electrons may be neglected, $1 - f_e = 1$. The number of spins of the incoming electron is two, $g_e = 2$, because photons scatter off both spins of electrons. On the other hand, the number of spins of the scattered electron and photon are one, $g_{e'} = g_{\gamma'} = 1$, because non-relativistic electron-photon scattering leaves the spins of the electron and photon unchanged. These considerations bring the photon collision term for electron-photon scattering to, from the general expression (24.57),

$$\overset{1}{C}[f_\gamma] = \frac{1}{16} \int |\mathcal M|^2 \left[ - f_e f_\gamma (1 + f_{\gamma'}) + f_{e'} f_{\gamma'} (1 + f_\gamma) \right] (2\pi\hbar)^4\, \delta_D^4(p_e + p_\gamma - p_{e'} - p_{\gamma'})\, \frac{2\, d^3p_e}{m_e (2\pi\hbar)^3}\, \frac{d^3p_{e'}}{m_e (2\pi\hbar)^3}\, \frac{d^3p_{\gamma'}}{p_{\gamma'} (2\pi\hbar)^3} . \tag{24.70}$$

The various integrations over momenta are most conveniently carried out as follows. The energy-conserving integral is best done over the energy of the scattered photon $\gamma'$, which is scattered into an interval $do_{\gamma'}$ of solid angle:

$$\int 2\pi\hbar\, \delta_D(E_e + E_\gamma - E_{e'} - E_{\gamma'})\, \frac{d^3p_{\gamma'}}{E_{\gamma'} (2\pi\hbar)^3} = p_{\gamma'}\, \frac{do_{\gamma'}}{(2\pi\hbar)^2} \approx p_\gamma\, \frac{do_{\gamma'}}{(2\pi\hbar)^2} . \tag{24.71}$$

The approximation in the last step of equation (24.71), replacing the energy $p_{\gamma'}$ of the scattered photon by the energy $p_\gamma$ of the incoming photon, is valid because, thanks to the smallness of the combination of rates in the collision integral (24.70), it suffices to treat the photon energy to unperturbed order.
As seen below, equation (24.76), the energy difference $p_\gamma - p_{\gamma'}$ between the incoming and scattered photons is of linear order in electron velocities.

The momentum-conserving integral is best done over the momentum of the scattered electron, which is $e'$ for outgoing scatterings $e + \gamma \rightarrow e' + \gamma'$, and $e$ for incoming scatterings $e + \gamma \leftarrow e' + \gamma'$. In the former case ($e + \gamma \rightarrow e' + \gamma'$),

$$\int (2\pi\hbar)^3\, \delta_D^3(\mathbf p_e + \mathbf p_\gamma - \mathbf p_{e'} - \mathbf p_{\gamma'})\, \frac{d^3p_{e'}}{E_{e'} (2\pi\hbar)^3} = \frac{1}{m_e} , \tag{24.72}$$

and the result is the same, $1/m_e$, in the latter case ($e + \gamma \leftarrow e' + \gamma'$). The energy- and momentum-conserving integrals having been done, the electron $e'$ in the latter case may be relabelled $e$. So relabelled, the combination of rate factors in the collision integral (24.70) becomes

$$- f_e f_\gamma (1 + f_{\gamma'}) + f_e f_{\gamma'} (1 + f_\gamma) = f_e \left( - f_\gamma + f_{\gamma'} \right) . \tag{24.73}$$

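Notice that the stimulated terms cancel. The energy- and momentum-conserving integrations (24.71) and (24.72) bring the photon collision term (24.70) to

$$\overset{1}{C}[f_\gamma] = \frac{p_\gamma}{16\pi m_e^2 \hbar^2} \int |\mathcal M|^2\, f_e \left( - f_\gamma + f_{\gamma'} \right) \frac{2\, d^3p_e}{(2\pi\hbar)^3}\, \frac{do_{\gamma'}}{4\pi} . \tag{24.74}$$

The collision integral (24.74) involves the difference $- f_\gamma + f_{\gamma'}$ between the occupancy of the initial and final photon states. To linear order, the difference is

$$- f_\gamma + f_{\gamma'} = - \overset{0}{f}(p_\gamma) + \overset{0}{f}(p_{\gamma'}) - \overset{1}{f}(\mathbf p_\gamma) + \overset{1}{f}(\mathbf p_{\gamma'}) = \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \left[ \frac{p_\gamma - p_{\gamma'}}{p_\gamma} - \Theta(\mathbf p_\gamma) + \Theta(\mathbf p_{\gamma'}) \right] . \tag{24.75}$$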

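The first term $(p_\gamma - p_{\gamma'})/p_\gamma$ arises because the incoming and scattered photon energies differ slightly. The difference in photon energies is given by energy conservation:

$$\begin{aligned}
p_\gamma - p_{\gamma'} &= E_{e'} - E_e \\
&= \left( m_e + \frac{p_{e'}^2}{2m_e} \right) - \left( m_e + \frac{p_e^2}{2m_e} \right) \\
&= \frac{(\mathbf p_e + \mathbf p_\gamma - \mathbf p_{\gamma'})^2 - p_e^2}{2m_e} \\
&= \frac{(\mathbf p_\gamma - \mathbf p_{\gamma'}) \cdot (2\mathbf p_e + \mathbf p_\gamma - \mathbf p_{\gamma'})}{2m_e} \\
&\approx (\mathbf p_\gamma - \mathbf p_{\gamma'}) \cdot \mathbf v_e ,
\end{aligned} \tag{24.76}$$

the last line of which follows because the photon momentum is small compared to the electron momentum,

$$p_\gamma \sim T \sim m_e v_e^2 = p_e v_e \ll p_e . \tag{24.77}$$

Because the photon energy difference is of first order, and the temperature fluctuation is already of first order, it suffices to regard the temperature fluctuation $\Theta$ as being a function only of the direction $\hat{\mathbf p}_\gamma$ of the photon momentum, not of its energy:

$$\Theta(\mathbf p_\gamma) \approx \Theta(\hat{\mathbf p}_\gamma) . \tag{24.78}$$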

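The linear approximations (24.76) and (24.78) bring the difference (24.75) between the initial and final photon occupancies to

$$- f_\gamma + f_{\gamma'} = \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \left[ (\hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'}) \cdot \mathbf v_e - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] . \tag{24.79}$$

Inserting this difference in occupancies into the collision integral (24.74) yields

$$\overset{1}{C}[f_\gamma] = \frac{p_\gamma}{16\pi m_e^2 \hbar^2}\, \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \int |\mathcal M|^2\, f_e \left[ (\hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'}) \cdot \mathbf v_e - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] \frac{2\, d^3p_e}{(2\pi\hbar)^3}\, \frac{do_{\gamma'}}{4\pi} . \tag{24.80}$$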


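The transition probability $|\mathcal M|^2$, equation (24.67), is independent of electron momenta, so the integration over electron momentum in the collision integral (24.80) is straightforward. The unperturbed electron density $\bar n_e$ and the electron bulk velocity $\mathbf v_e$ are defined by

$$\bar n_e \equiv \int \overset{0}{f}_e\, \frac{2\, d^3p_e}{(2\pi\hbar)^3} , \quad \bar n_e \mathbf v_e \equiv \int \overset{0}{f}_e\, \mathbf v_e\, \frac{2\, d^3p_e}{(2\pi\hbar)^3} . \tag{24.81}$$

Coulomb scattering keeps electrons and ions tightly coupled, so the electron bulk velocity $\mathbf v_e$ equals the baryon bulk velocity $\mathbf v_b$,

$$\mathbf v_e = \mathbf v_b . \tag{24.82}$$

Integration over the electron momentum brings the collision integral (24.80) to

$$\overset{1}{C}[f_\gamma] = \frac{\bar n_e\, p_\gamma}{16\pi m_e^2 \hbar^2}\, \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \int |\mathcal M|^2 \left[ (\hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'}) \cdot \mathbf v_b - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] \frac{do_{\gamma'}}{4\pi} . \tag{24.83}$$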

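Finally, the collision integral (24.83) must be integrated over the direction $\hat{\mathbf p}_{\gamma'}$ of the scattered photon. Inserting the electron-photon scattering transition probability $|\mathcal M|^2$ given by equation (24.67) into the collision integral (24.83) brings it to

$$\overset{1}{C}[f_\gamma] = \bar n_e \sigma_T\, p_\gamma\, \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \int \left[ 1 + \tfrac{1}{2} P_2(\hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'}) \right] \left[ (\hat{\mathbf p}_\gamma - \hat{\mathbf p}_{\gamma'}) \cdot \mathbf v_b - \Theta(\hat{\mathbf p}_\gamma) + \Theta(\hat{\mathbf p}_{\gamma'}) \right] \frac{do_{\gamma'}}{4\pi} . \tag{24.84}$$

The $\hat{\mathbf p}_{\gamma'} \cdot \mathbf v_b$ term in the integrand of (24.84) is odd, and vanishes on angular integration:

$$\int \left[ 1 + \tfrac{1}{2} P_2(\hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'}) \right] \hat{\mathbf p}_{\gamma'}\, \frac{do_{\gamma'}}{4\pi} = 0 . \tag{24.85}$$

Similarly, the angular integral over the quadrupole of quantities independent of $\hat{\mathbf p}_{\gamma'}$ vanishes:

$$\int P_2(\hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'}) \left[ \hat{\mathbf p}_\gamma \cdot \mathbf v_b - \Theta(\hat{\mathbf p}_\gamma) \right] \frac{do_{\gamma'}}{4\pi} = 0 . \tag{24.86}$$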

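The collision integral (24.84) thus reduces to

$$\overset{1}{C}[f_\gamma(\mathbf x, \hat{\mathbf p}_\gamma)] = \bar n_e \sigma_T\, p_\gamma\, \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \left\{ \hat{\mathbf p}_\gamma \cdot \mathbf v_b(\mathbf x) - \Theta(\mathbf x, \hat{\mathbf p}_\gamma) + \int \left[ 1 + \tfrac{1}{2} P_2(\hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'}) \right] \Theta(\mathbf x, \hat{\mathbf p}_{\gamma'})\, \frac{do_{\gamma'}}{4\pi} \right\} , \tag{24.87}$$

where the dependence of various quantities on comoving position $\mathbf x$ has been made explicit. Now transform to Fourier space (in effect, replace comoving position $\mathbf x$ by comoving wavevector $\mathbf k$). Replace the baryon bulk velocity by its scalar part, $\mathbf v_b = - i \hat{\mathbf k}\, v_b$, following the decomposition (24.37). To perform the remaining angular integral over the photon direction $\hat{\mathbf p}_{\gamma'}$, expand the temperature fluctuation $\Theta(\mathbf k, \hat{\mathbf p}_{\gamma'})$ in scalar multipole moments according to equation (24.48), and invoke the orthogonality relation (24.134). With $\mu_\gamma$ defined by

$$\mu_\gamma \equiv \hat{\mathbf k} \cdot \hat{\mathbf p}_\gamma , \tag{24.88}$$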



these manipulations bring the photon collision integral (24.87) at last to

$$\overset{1}{C}[f_\gamma(\mathbf k, \hat{\mathbf p}_\gamma)] = \bar n_e \sigma_T\, p_\gamma\, \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T} \left[ - i \mu_\gamma v_b(\mathbf k) - \Theta(\mathbf k, \mu_\gamma) + \Theta_0(\mathbf k) - \tfrac{1}{2} \Theta_2(\mathbf k) P_2(\mu_\gamma) \right] . \tag{24.89}$$
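The step from (24.87) to (24.89) rests on the angular integral $\int [1 + \tfrac12 P_2(\hat{\mathbf p}_\gamma \cdot \hat{\mathbf p}_{\gamma'})]\, \Theta\, do_{\gamma'}/4\pi = \Theta_0 - \tfrac12 \Theta_2 P_2(\mu_\gamma)$. The Python sketch below checks that result by brute-force quadrature over the sphere. It is an added illustration under stated assumptions: $\hat{\mathbf k}$ is taken along the $z$-axis, $\hat{\mathbf p}_\gamma$ lies in the $x$-$z$ plane, the multipoles are made-up numbers, and the function names are illustrative.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) from the Bonnet recursion."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def theta_of_mu(mu, theta_l):
    """Scalar-mode expansion (24.48), with mu = k-hat . p-hat."""
    return sum((-1j) ** l * (2 * l + 1) * t * legendre(l, mu)
               for l, t in enumerate(theta_l))


def scattered_in(mu_gamma, theta_l, n_mu=1000, n_phi=32):
    """∫ [1 + ½ P2(p·p')] Θ(k·p') do'/4π by Simpson in μ' and a uniform sum in φ'."""
    sin_gamma = math.sqrt(1.0 - mu_gamma ** 2)
    h = 2.0 / n_mu
    total = 0.0
    for j in range(n_mu + 1):
        mu_p = -1.0 + j * h
        weight = 1 if j in (0, n_mu) else (4 if j % 2 else 2)
        sin_p = math.sqrt(max(0.0, 1.0 - mu_p ** 2))
        kernel = 0.0
        for m in range(n_phi):
            phi = 2.0 * math.pi * m / n_phi
            cos_angle = mu_gamma * mu_p + sin_gamma * sin_p * math.cos(phi)
            kernel += 1.0 + 0.5 * legendre(2, cos_angle)
        kernel /= n_phi                       # average over azimuth, dφ'/2π
        total += weight * kernel * theta_of_mu(mu_p, theta_l)
    return total * h / 3.0 / 2.0              # Simpson factor times dμ'/2
```

For `theta_l = [0.3, -0.2, 0.1, 0.05]` and `mu_gamma = 0.4`, the result of `scattered_in` can be compared with `theta_l[0] - 0.5 * theta_l[2] * legendre(2, 0.4)`. The dipole and octupole drop out, as the orthogonality relation requires.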

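24.12 Boltzmann equation for photons

Inserting the collision term (24.89) into equation (24.47) yields the photon Boltzmann equation for scalar fluctuations in conformal Newtonian gauge,

$$\frac{d\Theta}{d\eta} = \frac{\partial\Theta}{\partial\eta} - i k \mu_\gamma \Theta - \dot\Phi - i k \mu_\gamma \Psi = \bar n_e \sigma_T a \left[ - i \mu_\gamma v_b - \Theta + \Theta_0 - \tfrac{1}{2} \Theta_2 P_2(\mu_\gamma) \right] . \tag{24.90}$$

Expanded into the scalar harmonics $\Theta_\ell(\eta, k)$, the photon Boltzmann equation (24.46) yields the hierarchy of photon multipole equations

$$\dot\Theta_0 - k \Theta_1 - \dot\Phi = 0 , \tag{24.91a}$$

$$\dot\Theta_1 + \frac{k}{3} \left( \Theta_0 - 2 \Theta_2 \right) + \frac{k}{3} \Psi = \frac{1}{3}\, \bar n_e \sigma_T a \left( v_b - 3 \Theta_1 \right) , \tag{24.91b}$$

$$\dot\Theta_2 + \frac{k}{5} \left( 2 \Theta_1 - 3 \Theta_3 \right) = - \frac{9}{10}\, \bar n_e \sigma_T a\, \Theta_2 , \tag{24.91c}$$

$$\dot\Theta_\ell + \frac{k}{2\ell + 1} \left[ \ell\, \Theta_{\ell-1} - (\ell + 1)\, \Theta_{\ell+1} \right] = - \bar n_e \sigma_T a\, \Theta_\ell \quad (\ell \geq 3) . \tag{24.91d}$$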
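For numerical work the hierarchy (24.91) is naturally coded as a function returning the time derivatives of the multipoles. The Python sketch below is one possible arrangement, offered as an illustration only. The truncation rule (setting $\Theta_{\ell_{\max}+1} = 0$) is an assumption of the sketch and is not a closure scheme taken from the text, and the name `photon_hierarchy_rhs` and the argument `rate` (standing for $\bar n_e \sigma_T a$) are illustrative.

```python
def photon_hierarchy_rhs(theta, v_b, k, phi_dot, psi, rate):
    """Time derivatives dΘ_l/dη from equations (24.91).

    theta   : list [Θ_0, ..., Θ_lmax] of scalar multipoles
    v_b     : scalar baryon bulk velocity
    k       : comoving wavenumber
    phi_dot : dΦ/dη
    psi     : potential Ψ
    rate    : scattering rate n_e σ_T a
    Multipoles outside the list are treated as zero (an assumed truncation).
    """
    lmax = len(theta) - 1

    def get(l):
        return theta[l] if 0 <= l <= lmax else 0.0

    d = [0.0] * (lmax + 1)
    d[0] = k * get(1) + phi_dot                                    # (24.91a)
    if lmax >= 1:
        d[1] = (-k / 3.0 * (get(0) - 2.0 * get(2)) - k / 3.0 * psi
                + rate / 3.0 * (v_b - 3.0 * get(1)))               # (24.91b)
    if lmax >= 2:
        d[2] = (-k / 5.0 * (2.0 * get(1) - 3.0 * get(3))
                - 0.9 * rate * get(2))                             # (24.91c)
    for l in range(3, lmax + 1):
        d[l] = (-k / (2 * l + 1) * (l * get(l - 1) - (l + 1) * get(l + 1))
                - rate * get(l))                                   # (24.91d)
    return d
```

A useful sanity check is that when $v_b = 3\Theta_1$ and $\Theta_\ell = 0$ for $\ell \geq 2$, the returned derivatives do not depend on `rate`: the collision terms all vanish, which is the tight-coupling limit discussed next.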

The Boltzmann hierarchy (24.91) shows that all the photon multipoles are affected by electron-photon scattering, but only the photon dipole $\Theta_1$ depends directly on one of the baryon variables, the baryon bulk velocity $v_b$. The dependence on the baryon velocity $v_b$ reflects the fact that, to linear order, there is a transfer of momentum between photons and baryons, but no transfer of number or of energy. For the dipole, $\ell = 1$, electron-photon scattering drives the electron and photon bulk velocities into near equality,

$$v_b \approx 3 \Theta_1 , \tag{24.92}$$

so that the right hand side of the dipole equation (24.91b) is modest despite the large scattering factor $\bar n_e \sigma_T a$. The approximation in which the bulk velocities of electrons and photons are exactly equal, $v_b = 3 \Theta_1$, and all the higher multipoles vanish, $\Theta_\ell = 0$ for $\ell \geq 2$, is called the tight-coupling approximation. The tight-coupling approximation was already invoked in the “simple” model of Chapter 23.

For multipoles $\ell \geq 2$, the electron-photon scattering term on the right hand side of the Boltzmann hierarchy (24.91) acts as a damping term that tends to drive the multipole exponentially into equilibrium (the solution to the homogeneous equation $\dot\Theta_\ell + \bar n_e \sigma_T a\, \Theta_\ell = 0$ is a decaying exponential). As seen in Chapter 23, in the tight-coupling approximation the monopole and dipole oscillate with a natural frequency of $\omega = c_s k$, where $c_s$ is the sound speed. These oscillations provide a source that propagates upward to higher harmonic number $\ell$. For scales much larger than a mean free path, $k/(\bar n_e \sigma_T a) \ll 1$, the time derivative is small compared to the scattering term, $|\dot\Theta| \sim c_s k |\Theta| \ll \bar n_e \sigma_T a |\Theta|$, reflecting the near-equilibrium response of the higher harmonics. For multipoles $\ell \geq 2$, the dominant term on the left hand side of the Boltzmann hierarchy (24.91) is the lowest order multipole, which acts as a driver. Solution of the Boltzmann equations (24.91) then requires that

$$\Theta_{\ell+1} \sim \frac{k}{\bar n_e \sigma_T a}\, \Theta_\ell \quad \text{for } \ell \geq 2 . \tag{24.93}$$

The relation (24.93) implies that higher order photon multipoles are successively smaller than lower orders, $|\Theta_{\ell+1}| \ll |\Theta_\ell|$, for scales much larger than a mean free path, $k/(\bar n_e \sigma_T a) \ll 1$. This accords with the physical expectation that electron-photon scattering drives the photon distribution to near isotropy.

24.13 Diffusive (Silk) damping

The tight coupling between photons and baryons is not perfect, because the mean free path for electron-photon scattering is finite, not zero. The imperfect coupling causes sound waves to dissipate at scales comparable to the mean free path. To lowest order, the dissipation can be taken into account by including the photon quadrupole $\Theta_2$ in the system (24.91) of photon multipole equations, but still neglecting the higher multipoles, $\Theta_\ell = 0$ for $\ell \geq 3$. According to the estimate (24.93), this approximation is valid for scales much larger than a mean free path, $k/(\bar n_e \sigma_T a) \ll 1$. The approximation is equivalent to a diffusion approximation. In the diffusion approximation, the time derivative $\dot\Theta_2$ is negligible compared to the scattering term, and the photon quadrupole equation (24.91c) reduces to

$$\Theta_2 = - \frac{4k}{9 \bar n_e \sigma_T a}\, \Theta_1 . \tag{24.94}$$

Substituted into the photon momentum equation (24.91b), the photon quadrupole Θ2 (24.94) acts as a source of friction on the photon dipole Θ1 .
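The diffusion value (24.94) is the fixed point to which the quadrupole equation (24.91c) relaxes when scattering is rapid. The Python sketch below illustrates this in a deliberately simplified setting: $\Theta_1$, $k$ and the scattering rate $\bar n_e \sigma_T a$ are held constant and $\Theta_3$ is set to zero, all of which are assumptions of the sketch and not statements about the real evolution. The function names are illustrative.

```python
def theta2_diffusion(theta1, k, rate):
    """Equation (24.94): quasi-static quadrupole, with rate = n_e σ_T a."""
    return -4.0 * k / (9.0 * rate) * theta1


def relax_quadrupole(theta1, k, rate, eta_end=1.0, steps=20000):
    """Integrate (24.91c) with Θ_3 = 0 and Θ_1 fixed, starting from Θ_2 = 0 (RK4)."""
    h = eta_end / steps

    def rhs(t2):
        return -0.4 * k * theta1 - 0.9 * rate * t2

    t2 = 0.0
    for _ in range(steps):
        k1 = rhs(t2)
        k2 = rhs(t2 + h / 2 * k1)
        k3 = rhs(t2 + h / 2 * k2)
        k4 = rhs(t2 + h * k3)
        t2 += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return t2
```

With, say, `theta1 = 0.1`, `k = 1.0` and `rate = 50.0`, the integrated quadrupole settles onto `theta2_diffusion(0.1, 1.0, 50.0)` within a few scattering times, $\eta \sim 1/(0.9\,\bar n_e \sigma_T a)$.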

24.14 Baryons

The equations governing baryonic matter are similar to those governing non-baryonic cold dark matter, §24.5, except that baryons are collisional. Coulomb scattering between electrons and ions keeps baryons tightly coupled to each other. Electron-photon scattering couples baryons to the photons. Since the unperturbed distribution of baryons is in thermodynamic equilibrium, the unperturbed collision term vanishes for each species of baryonic matter, as it did for photons, equation (24.69),

$$\overset{0}{C}[f_b] = 0 . \tag{24.95}$$

For the perturbed baryon distribution, only the zeroth and first moments of the phase-space distribution are important, since these govern the baryon overdensity $\delta_b$ and bulk velocity $\mathbf v_b$. The relevant collision term is the electron collision term associated with electron-photon scattering. Since electron-photon scattering neither creates nor destroys electrons, the zeroth moment of the electron collision term vanishes,

$$\int \overset{1}{C}[f_e]\, \frac{2\, d^3p_e}{m_e (2\pi\hbar)^3} = 0 . \tag{24.96}$$

The first moment of the electron collision term is most easily determined from the fact that electron-photon collisions must conserve the total momentum of electrons and photons:

$$\int \overset{1}{C}[f_e]\, m_e \mathbf v_e\, \frac{2\, d^3p_e}{m_e (2\pi\hbar)^3} + \int \overset{1}{C}[f_\gamma]\, \mathbf p_\gamma\, \frac{2\, d^3p_\gamma}{p_\gamma (2\pi\hbar)^3} = 0 . \tag{24.97}$$

Substituting the expression (24.87) for the photon collision integral into equation (24.97), separating out factors depending on the absolute magnitude $p_\gamma$ and direction $\hat{\mathbf p}_\gamma$ of the photon momentum, and taking into consideration that the integral terms in equation (24.87), when multiplied by $\hat{\mathbf p}_\gamma$, are odd in $\hat{\mathbf p}_\gamma$, and therefore vanish on integration over directions $\hat{\mathbf p}_\gamma$, gives

$$\int \overset{1}{C}[f_e]\, m_e \mathbf v_e\, \frac{2\, d^3p_e}{m_e (2\pi\hbar)^3} = \bar n_e \sigma_T \int \frac{\partial\overset{0}{f}_\gamma}{\partial\ln T}\, p_\gamma^2\, \frac{2 \cdot 4\pi p_\gamma^2\, dp_\gamma}{p_\gamma (2\pi\hbar)^3} \int \left[ - \hat{\mathbf p}_\gamma \cdot \mathbf v_b + \Theta(\hat{\mathbf p}_\gamma) \right] \hat{\mathbf p}_\gamma\, \frac{do_\gamma}{4\pi} . \tag{24.98}$$

The integral over the magnitude $p_\gamma$ of the photon momentum in equation (24.98) yields $4\bar\rho_\gamma$, in accordance with equation (24.54). Transformed into Fourier space, and with only scalar terms retained, the collision integral (24.98) becomes, with $\mu_\gamma \equiv \hat{\mathbf k} \cdot \hat{\mathbf p}_\gamma$,

$$\hat{\mathbf k} \cdot \int \overset{1}{C}[f_e]\, m_e \mathbf v_e\, \frac{2\, d^3p_e}{m_e (2\pi\hbar)^3} = 4 \bar\rho_\gamma\, \bar n_e \sigma_T \int \left[ i \mu_\gamma v_b + \Theta \right] \mu_\gamma\, \frac{do_\gamma}{4\pi} = \frac{4}{3}\, i \bar\rho_\gamma\, \bar n_e \sigma_T \left( v_b - 3 \Theta_1 \right) . \tag{24.99}$$

The result is that the equations governing the baryon overdensity $\delta_b$ and scalar bulk velocity $v_b$ look like those (24.38) governing non-baryonic cold dark matter, except that the velocity equation has an additional source (24.99) arising from momentum transfer with photons through electron-photon scattering:

$$\dot\delta_b - k v_b - 3 \dot\Phi = 0 , \tag{24.100a}$$

$$\dot v_b + \frac{\dot a}{a}\, v_b + k \Psi = - \frac{\bar n_e \sigma_T a}{R} \left( v_b - 3 \Theta_1 \right) , \tag{24.100b}$$

where $R$ is $\tfrac{3}{4}$ the baryon-to-photon density ratio,

$$R \equiv \frac{3 \bar\rho_b}{4 \bar\rho_\gamma} = R_a\, \frac{a}{a_{\rm eq}} , \quad R_a = \frac{3 g_\rho \Omega_b}{8 \Omega_m} \approx 0.21 , \tag{24.101}$$

with $g_\rho = 2 + 6\, \tfrac{7}{8} \left( \tfrac{4}{11} \right)^{4/3} = 3.36$ being the energy-weighted effective number of relativistic particle species at around the time of recombination (Exercise 10.18). The parameter $R$ plays an important role in that it modulates the sound speed in the photon-baryon fluid, equation (24.110) below.

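The numbers quoted in equation (24.101) can be reproduced in a few lines. The Python sketch below is an added illustration: the density parameters $\Omega_b = 0.05$ and $\Omega_m = 0.30$ used in the example are hypothetical round values, not values taken from the text, and the function names are illustrative.

```python
def g_rho():
    """Energy-weighted number of relativistic species, 2 + 6 (7/8) (4/11)^(4/3)."""
    return 2.0 + 6.0 * (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0)


def r_a(omega_b, omega_m):
    """R_a = 3 g_ρ Ω_b / (8 Ω_m), the value of R at matter-radiation equality."""
    return 3.0 * g_rho() * omega_b / (8.0 * omega_m)


def baryon_loading(a_over_aeq, omega_b, omega_m):
    """R = R_a a / a_eq, equation (24.101)."""
    return r_a(omega_b, omega_m) * a_over_aeq


# Hypothetical round-number density parameters, for illustration only.
EXAMPLE_OMEGA_B = 0.05
EXAMPLE_OMEGA_M = 0.30
```

With these inputs `g_rho()` evaluates to about 3.36 and `r_a(EXAMPLE_OMEGA_B, EXAMPLE_OMEGA_M)` to about 0.21, matching the values quoted in (24.101).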

24.15 Viscous baryon drag damping

A second source of damping of sound waves, distinct from the diffusive damping of §24.13, arises from the viscous drag on photons that results from a small difference $v_b - 3\Theta_1$ between the baryon and photon bulk velocities. This damping is associated with the finite mass density of baryons, and vanishes in the limit of small baryon density. However, for realistic values of the baryon density, viscous baryon drag damping is comparable to diffusive damping. Equation (24.100b) for the baryon bulk velocity may be written

$$v_b - 3 \Theta_1 = - \frac{R}{\bar n_e \sigma_T a} \left( \dot v_b + \frac{\dot a}{a}\, v_b + k \Psi \right) . \tag{24.102}$$

The right hand side of this equation is small because the scattering rate $\bar n_e \sigma_T$ is large, so to lowest order $v_b = 3 \Theta_1$, the tight-coupling approximation. To ascertain the effect of a small difference in the baryon and photon bulk velocities, take equation (24.102) to next order. In the circumstances where damping is important, which is small scales well inside the sound horizon, $k \eta_s \gg 1$, the dominant term on the right hand side of equation (24.102) is the time derivative $\dot v_b$. With only this term kept, equation (24.102) reduces to

$$v_b - 3 \Theta_1 \approx - \frac{R}{\bar n_e \sigma_T a}\, \dot v_b \approx - \frac{3R}{\bar n_e \sigma_T a}\, \dot\Theta_1 . \tag{24.103}$$

Substituting the approximation (24.103) into the left hand side of the baryon velocity equation (24.100b) gives

$$\dot v_b + \frac{\dot a}{a}\, v_b + k \Psi = 3 \left( \dot\Theta_1 + \frac{\dot a}{a}\, \Theta_1 + \frac{k}{3} \Psi \right) - \frac{3R}{\bar n_e \sigma_T a}\, \ddot\Theta_1 , \tag{24.104}$$

in which the last term on the right hand side is the small correction from the non-vanishing baryon-photon velocity difference, equation (24.103). As in the approximation (24.103), only the dominant term, the one arising from the time derivative $\dot v_b$, has been retained in the correction, and in differentiating the right hand side of equation (24.103), the time derivative of $3R/(\bar n_e \sigma_T a)$ has been neglected compared to the time derivative of $\dot\Theta_1$. The final simplification is to replace the second time derivative of the photon dipole in the correction term by its unperturbed value, $\ddot\Theta_1 \approx - c_s^2 k^2 \Theta_1$, an approximation that is valid because the correction term is already small. Here $c_s$ is the sound speed, found in the next section to be given by equation (24.110). The result is

$$\dot v_b + \frac{\dot a}{a}\, v_b + k \Psi = 3 \left( \dot\Theta_1 + \frac{\dot a}{a}\, \Theta_1 + \frac{k}{3} \Psi \right) + \frac{3 R\, c_s^2 k^2}{\bar n_e \sigma_T a}\, \Theta_1 . \tag{24.105}$$

This equation (24.105) is used to develop the photon-baryon wave equation in the next section, §24.16.

24.16 Photon-baryon wave equation

Combining the photon monopole and dipole equations (24.91a) and (24.91b) with the baryon momentum equation (24.100b), and making the diffusion approximation (24.94) for the photon quadrupole, and the approximation (24.105) for the baryon bulk velocity, yields a second order differential equation (24.113) that captures accurately the behaviour of the photon distribution. The equation is that for a damped harmonic oscillator, forced by the gravitational potential terms on its right hand side.

Adding the momentum equations (24.91b) and (24.100b) for photons and baryons yields an equation that expresses momentum conservation for the combined photon-baryon fluid,

$$\dot\Theta_1 + \frac{k}{3} \left( \Theta_0 - 2 \Theta_2 \right) + \frac{k}{3} \Psi + \frac{R}{3} \left( \dot v_b + \frac{\dot a}{a}\, v_b + k \Psi \right) = 0 . \tag{24.106}$$

Setting the photon quadrupole $\Theta_2$ equal to its diffusive value (24.94), and substituting the baryon velocity equation (24.105), brings the photon-baryon momentum conservation equation (24.106) to

$$\dot\Theta_1 + \frac{k}{3} \Theta_0 + \frac{8 k^2}{27 \bar n_e \sigma_T a}\, \Theta_1 + \frac{k}{3} \Psi + R \left( \dot\Theta_1 + \frac{\dot a}{a}\, \Theta_1 + \frac{k}{3} \Psi \right) + \frac{R^2 c_s^2 k^2}{\bar n_e \sigma_T a}\, \Theta_1 = 0 . \tag{24.107}$$

Equation (24.107) rearranges to

$$\left\{ \frac{d}{d\eta} + \frac{R}{1 + R} \frac{\dot a}{a} + \frac{k^2}{3 \bar n_e \sigma_T a (1 + R)} \left[ \frac{8}{9} + \frac{R^2}{1 + R} \right] \right\} \Theta_1 + \frac{k}{3(1 + R)}\, \Theta_0 + \frac{k}{3} \Psi = 0 , \tag{24.108}$$

where equation (24.110) has been used to replace the sound speed $c_s$. Finally, eliminating the dipole $\Theta_1$ in favour of the monopole $\Theta_0$ using the photon monopole equation (24.91a) yields a second order differential equation for $\Theta_0 - \Phi$:

$$\left\{ \frac{d^2}{d\eta^2} + \left[ \frac{R}{1 + R} \frac{\dot a}{a} + \frac{k^2}{3 \bar n_e \sigma_T a (1 + R)} \left( \frac{8}{9} + \frac{R^2}{1 + R} \right) \right] \frac{d}{d\eta} + \frac{k^2}{3(1 + R)} \right\} (\Theta_0 - \Phi) = - \frac{k^2}{3(1 + R)} \left[ (1 + R) \Psi + \Phi \right] . \tag{24.109}$$

Equation (24.109) is a wave equation for a damped, driven oscillator with sound speed

$$c_s = \sqrt{\frac{1}{3(1 + R)}} , \tag{24.110}$$

which is the adiabatic sound speed $c_s$ for a fluid in which photons provide all the pressure, but both photons and baryons contribute to the mass density. The term proportional to $\dot a/a$ on the left hand side of equation (24.109) can be expressed as, since $R \propto a$,

$$\frac{R}{1 + R} \frac{\dot a}{a} = - 2\, \frac{\dot c_s}{c_s} . \tag{24.111}$$

Define the conformal sound time $\eta_s$ by

$$d\eta_s \equiv c_s\, d\eta , \tag{24.112}$$

with respect to which sound waves move at unit velocity, unit comoving distance per unit conformal time, $dx/d\eta_s = 1$. Recast in terms of the conformal sound time $\eta_s$, the wave equation (24.109) becomes

$$\left\{ \frac{d^2}{d\eta_s^2} + \left[ - \frac{c_s'}{c_s} + \frac{k^2 c_s}{\bar n_e \sigma_T a} \left( \frac{8}{9} + \frac{R^2}{1 + R} \right) \right] \frac{d}{d\eta_s} + k^2 \right\} (\Theta_0 - \Phi) = - k^2 \left[ (1 + R) \Psi + \Phi \right] , \tag{24.113}$$

where prime $'$ denotes derivative with respect to conformal sound time, $c_s' = dc_s/d\eta_s$. The “simple” photon wave equation (23.45) derived in Chapter 23 is obtained from the wave equation (24.113) in the limit of negligible baryon-to-photon density, $R \approx 0$.

24.17 Damping of photon-baryon sound waves The terms proportional to the linear derivative d/dηs in the wave equation (24.113) are damping terms, the first being an adiabatic damping term associated with variation of the sound speed, and the others being dissipative damping terms associated respectively with the finite mean free path of electron-photon scattering, and with viscous baryon drag. Lump these terms into a damping parameter κ defined by  ′   c 8 k 2 cs R2 1 − s+ . (24.114) + κ≡ 2 cs n ¯ e σT a 9 1 + R The damping parameter κ varies slowly compared to the frequency of the sound wave, so κ can be treated as approximately constant. The homogeneous wave equation (equation (24.113) with zero on the right hand side) can then be solved by introducing a frequency ω defined by Θ0 − Φ ∝ e

R

ω dηs

.

(24.115)

The homogeneous wave equation (24.113) is then equivalent to ω ′ + ω 2 + 2κω + k 2 = 0 .

(24.116)

Since the damping parameter κ is approximately constant (and the comoving wavevector k is by definition constant), ω′ is small compared to the other terms in equation (24.116). With ω′ set to zero, the solution of equation (24.116) is

$$\omega = - \kappa \pm i \sqrt{k^2 - \kappa^2} \approx - \kappa \pm i k , \tag{24.117}$$

where the last approximation is valid since the damping rate is small compared to the frequency, κ ≪ k. Thus the homogeneous solutions of the wave equation (24.113) are

$$\Theta_0 - \Phi \propto e^{- \int \kappa \, d\eta_s \pm i k \eta_s} . \tag{24.118}$$

In the present case, the first of the sources (24.114) of damping is the adiabatic damping term

$$\kappa_a \equiv - \frac{1}{2} \frac{c_s'}{c_s} . \tag{24.119}$$

The integral of the adiabatic damping term is $\int \kappa_a \, d\eta_s = - \tfrac{1}{2} \ln c_s$, whose exponential is

$$e^{- \int \kappa_a \, d\eta_s} = \sqrt{c_s} . \tag{24.120}$$

This shows that, as the sound speed decreases thanks to the increasing baryon-to-photon density in the expanding Universe, the amplitude of a sound wave decreases as the square root of the sound speed.




Cosmological perturbations: a more careful treatment of photons and baryons

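The remaining damping terms are the dissipative terms

$$\kappa_d \equiv \frac{k^2 c_s}{2 \bar{n}_e \sigma_T a} \left( \frac{8}{9} + \frac{R^2}{1+R} \right) . \tag{24.121}$$

The integral of the dissipative damping terms is

$$\int \kappa_d \, d\eta_s = \frac{k^2}{k_d^2} , \tag{24.122}$$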

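where kd is the damping scale defined by

$$\frac{1}{k_d^2} \equiv \int \frac{c_s}{2 \bar{n}_e \sigma_T a} \left( \frac{8}{9} + \frac{R^2}{1+R} \right) d\eta_s = \int \frac{1}{6 \bar{n}_e \sigma_T a (1+R)} \left( \frac{8}{9} + \frac{R^2}{1+R} \right) d\eta . \tag{24.123}$$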

The resulting damping factor is

$$e^{- \int \kappa_d \, d\eta_s} = e^{- k^2 / k_d^2} . \tag{24.124}$$

Thus the effect of dissipation is to damp temperature fluctuations exponentially at scales smaller than the diffusion scale kd. With adiabatic and diffusion damping included, the homogeneous solutions to the wave equation (24.113) are approximately

$$\Theta_0 - \Phi \propto \sqrt{c_s} \, e^{- k^2 / k_d^2} \, e^{\pm i k \eta_s} . \tag{24.125}$$

The driving potential on the right hand side of the wave equation (24.113) causes Θ0 − Φ to oscillate not around zero, but rather around the offset −[(1 + R)Ψ + Φ]. At the high frequencies where damping is important, this driving potential also varies slowly compared to the wave frequency. To the extent that the driving potential is slowly varying, the complete solution of the inhomogeneous wave equation (24.113) is

$$\Theta_0 + (1+R)\Psi \propto \sqrt{c_s} \, e^{- k^2 / k_d^2} \, e^{\pm i k \eta_s} . \tag{24.126}$$

As will be seen in Chapter 25, the monopole contribution to CMB fluctuations is not the photon monopole Θ0 by itself, but rather Θ0 + Ψ, which is the monopole redshifted by the potential Ψ. This redshifted monopole is

$$\Theta_0 + \Psi = - R\Psi + A \sqrt{c_s} \, e^{- k^2 / k_d^2} \, e^{\pm i k \eta_s} . \tag{24.127}$$

Exercise 24.4 Diffusion scale. Show that the damping scale kd defined by (24.123) is given by, with a normalized to aeq = 1,

$$\frac{H_{\rm eq}^2}{k_d^2} = \frac{8 \sqrt{2} \, \pi G m_p \, \Omega_m}{9 c \sigma_T (1 - X_n) H_{\rm eq} \Omega_b} \int_0^a \frac{a^2}{X_e \sqrt{1+a} \, (1+R)} \left( \frac{8}{9} + \frac{R^2}{1+R} \right) da , \tag{24.128}$$

the Hubble parameter Heq at matter-radiation equality being related to the present-day Hubble parameter H0 by equation (23.41). If baryons are taken to be fully ionized, Xe = 1, which ceases to be a good


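approximation near recombination, then the integral can be done analytically. At times well after matter-radiation equality,

$$f(a) \equiv \int_0^a \frac{a^2}{\sqrt{1+a} \, (1+R)} \left( \frac{8}{9} + \frac{R^2}{1+R} \right) da \rightarrow \frac{2}{5} a^{5/2} \quad \mbox{for } a \gg 1 , \tag{24.129}$$

independent of the value of the constant $R_a \equiv R/a$. Conclude that, neglecting the effect of recombination on the electron fraction Xe,

$$\frac{H_{\rm eq}^2}{k_d^2} = \frac{6.821 \, h^{-1}}{1 - X_n} \frac{H_0}{H_{\rm eq}} \frac{\Omega_m}{\Omega_b} \, f(a/a_{\rm eq}) = 4.9 \times 10^{-4} f(a/a_{\rm eq}) \approx 0.0016 \left( \frac{a}{a_*} \right)^{5/2} , \tag{24.130}$$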

where a∗ is the cosmic scale factor at recombination.
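As a numerical sanity check on the asymptotic behaviour of f(a), the following minimal sketch integrates the integrand with a composite Simpson rule and compares it with (2/5) a^{5/2}. It assumes R = R_a a with a normalized to aeq = 1; the function names and the sample values of R_a are illustrative choices, not taken from the text.

```python
import math

def damping_integrand(a, R_a):
    """Integrand of f(a), with R = R_a * a and a normalized to a_eq = 1."""
    R = R_a * a
    return a**2 / (math.sqrt(1.0 + a) * (1.0 + R)) * (8.0 / 9.0 + R**2 / (1.0 + R))

def f_damping(a, R_a, steps=20000):
    """Composite Simpson estimate of f(a); steps must be even."""
    h = a / steps
    total = damping_integrand(0.0, R_a) + damping_integrand(a, R_a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * damping_integrand(i * h, R_a)
    return total * h / 3.0

def f_asymptote(a):
    """Large-a limit (2/5) a^(5/2)."""
    return 0.4 * a**2.5

if __name__ == "__main__":
    for R_a in (0.2, 1.0):          # hypothetical values of the constant R_a
        for a in (1e2, 1e3, 1e4):
            ratio = f_damping(a, R_a) / f_asymptote(a)
            print(f"R_a = {R_a}, a = {a:g}: f / asymptote = {ratio:.4f}")
```

The ratio approaches one only slowly, the leading correction falling off as 1/a, so the asymptotic form is a rough guide rather than a precise fit at moderate a.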

24.18 Ionization and recombination

24.19 Neutrinos

Before electron-positron annihilation at temperature T ≈ 1 MeV, weak interactions were fast enough that scattering between neutrinos, antineutrinos, electrons, and positrons kept neutrinos and antineutrinos in thermodynamic equilibrium with baryons. After $e\bar{e}$ annihilation, neutrinos and antineutrinos decoupled, rather like photons decoupled at recombination. After decoupling, neutrinos streamed freely.

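24.20 Summary of equations

Non-baryonic cold dark matter, baryons, photons, neutrinos:

$$\begin{aligned}
\dot{\delta}_c - k v_c &= 3 \dot{\Phi} , && \mbox{(24.131a)} \\
\dot{v}_c + \frac{\dot{a}}{a} v_c &= - k \Psi , && \mbox{(24.131b)} \\
\dot{\delta}_b - k v_b &= 3 \dot{\Phi} , && \mbox{(24.131c)} \\
\dot{v}_b + \frac{\dot{a}}{a} v_b &= - k \Psi - \frac{\bar{n}_e \sigma_T a}{R} (v_b - 3 \Theta_1) , && \mbox{(24.131d)} \\
\dot{\Theta}_0 - k \Theta_1 &= \dot{\Phi} , && \mbox{(24.131e)} \\
\dot{\Theta}_1 + \frac{k}{3} (\Theta_0 - 2 \Theta_2) &= - \frac{k}{3} \Psi + \frac{1}{3} \bar{n}_e \sigma_T a \, (v_b - 3 \Theta_1) , && \mbox{(24.131f)} \\
\frac{2k}{5} \Theta_1 &= - \frac{9}{10} \bar{n}_e \sigma_T a \, \Theta_2 , && \mbox{(24.131g)}
\end{aligned}$$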


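Einstein energy and quadrupole pressure equations:

$$\begin{aligned}
- k^2 \Phi - 3 \frac{\dot{a}}{a} F &= 4 \pi G a^2 \left( \bar{\rho}_c \delta_c + \bar{\rho}_b \delta_b + 4 \bar{\rho}_\gamma \Theta_0 + 4 \bar{\rho}_\nu N_0 \right) , && \mbox{(24.132a)} \\
k^2 (\Psi - \Phi) &= - 32 \pi G a^2 \left( \bar{\rho}_r \Theta_2 + \bar{\rho}_\nu N_2 \right) . && \mbox{(24.132b)}
\end{aligned}$$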

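24.21 Legendre polynomials

The Legendre polynomials Pℓ(µ) satisfy the orthogonality relations

$$\int_{-1}^{1} P_\ell(\mu) P_{\ell'}(\mu) \, d\mu = \frac{2}{2\ell+1} \delta_{\ell\ell'} \tag{24.133}$$

and

$$\int P_\ell(\hat{z} \cdot \hat{a}) P_{\ell'}(\hat{z} \cdot \hat{b}) \, do_z = \frac{4\pi}{2\ell+1} P_\ell(\hat{a} \cdot \hat{b}) \, \delta_{\ell\ell'} , \tag{24.134}$$

the recurrence relation

$$\mu P_\ell(\mu) = \frac{1}{2\ell+1} \left[ \ell P_{\ell-1}(\mu) + (\ell+1) P_{\ell+1}(\mu) \right] , \tag{24.135}$$

and the derivative relation

$$\frac{dP_\ell(\mu)}{d\mu} = \frac{\ell+1}{1-\mu^2} \left[ \mu P_\ell(\mu) - P_{\ell+1}(\mu) \right] . \tag{24.136}$$

The first few Legendre polynomials are

$$P_0(\mu) = 1 , \qquad P_1(\mu) = \mu , \qquad P_2(\mu) = - \frac{1}{2} + \frac{3}{2} \mu^2 . \tag{24.137}$$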
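The relations above are easy to check numerically. The sketch below, whose function names are illustrative, generates Pℓ(µ) from the recurrence relation rearranged as (ℓ+1) P_{ℓ+1} = (2ℓ+1) µ Pℓ − ℓ P_{ℓ−1}, evaluates the derivative in the form (1 − µ²) dPℓ/dµ = (ℓ+1)[µPℓ − P_{ℓ+1}], and estimates the orthogonality integral over µ with a Simpson rule.

```python
def legendre(lmax, mu):
    """Return [P_0(mu), ..., P_lmax(mu)] built up by the three-term recurrence."""
    p = [1.0, mu]
    for l in range(1, lmax):
        # (l+1) P_{l+1} = (2l+1) mu P_l - l P_{l-1}
        p.append(((2 * l + 1) * mu * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]

def legendre_derivative(l, mu):
    """dP_l/dmu from (1 - mu^2) dP_l/dmu = (l+1) [mu P_l - P_{l+1}]; needs |mu| != 1."""
    p = legendre(l + 1, mu)
    return (l + 1) * (mu * p[l] - p[l + 1]) / (1.0 - mu * mu)

def overlap(l1, l2, steps=2000):
    """Simpson estimate of the integral of P_l1 P_l2 over [-1, 1]; steps must be even."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps + 1):
        p = legendre(max(l1, l2), -1.0 + i * h)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * p[l1] * p[l2]
    return total * h / 3.0

if __name__ == "__main__":
    print(legendre(2, 0.3)[2])          # compare with -1/2 + (3/2) mu^2
    print(legendre_derivative(3, 0.3))  # compare with (15 mu^2 - 3)/2
    print(overlap(2, 2), overlap(1, 3)) # compare with 2/(2l+1) and 0
```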

25 Fluctuations in the Cosmic Microwave Background

25.1 Primordial power spectrum

Inflation generically predicts gaussian initial fluctuations with a scale-free power spectrum, in which the variance of the potential is the same on all scales,

$$\langle \Phi(\boldsymbol{x}') \Phi(\boldsymbol{x}) \rangle \equiv \xi_\Phi(|\boldsymbol{x}' - \boldsymbol{x}|) = \mbox{constant} , \tag{25.1}$$

independent of separation |x′ − x|. A scale-free primordial power spectrum was originally proposed as a natural initial condition by Harrison and Zel’dovich (Harrison, 1970, PRD 1, 2726; Zel’dovich, 1972, MNRAS, 160, 1P), before the idea of inflation was conceived. During inflation, vacuum fluctuations generate fluctuations in the potential which become frozen in as they fly over the horizon. The amplitude of these fluctuations remains constant as inflation continues, producing a scale-free power spectrum. The power spectrum PΦ(k) of potential fluctuations is defined by

$$\langle \Phi(\boldsymbol{k}') \Phi(\boldsymbol{k}) \rangle \equiv (2\pi)^3 \delta_D(\boldsymbol{k}' + \boldsymbol{k}) \, P_\Phi(k) , \tag{25.2}$$

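the power spectrum PΦ(k) being related to the correlation function ξΦ(x) by (with the standard convention in cosmology for the choice of signs and factors of 2π)

$$P_\Phi(k) = \int e^{i \boldsymbol{k} \cdot \boldsymbol{x}} \, \xi_\Phi(x) \, d^3x , \qquad \xi_\Phi(x) = \int e^{- i \boldsymbol{k} \cdot \boldsymbol{x}} \, P_\Phi(k) \, \frac{d^3k}{(2\pi)^3} . \tag{25.3}$$

The scale-free character means that the dimensionless power spectrum $\Delta_\Phi^2(k)$ defined by

$$\Delta_\Phi^2(k) \equiv P_\Phi(k) \, \frac{4\pi k^3}{(2\pi)^3} \tag{25.4}$$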

is constant. Actually, the power spectrum generated by inflation is not precisely scale-free, because inflation comes to an end, which breaks scale-invariance. The departure from scale-invariance is conventionally characterized by a scalar spectral index, the tilt n, such that

$$\Delta_\Phi^2(k) \propto k^{n-1} . \tag{25.5}$$


Thus a scale-invariant power spectrum has

$$n = 1 \quad \mbox{(scale-invariant)} . \tag{25.6}$$

Different inflationary models predict different tilts, mostly close to but slightly less than 1.

25.2 Normalization of the power spectrum

It is convenient to normalize the power spectrum of potential fluctuations to its amplitude at the recombination distance today, k(η0 − η∗) = 1, and not to the initial potential Φ(0) (which vanishes for isocurvature initial conditions), but rather to the late-time matter-dominated potential Φ(late) at superhorizon scales,

$$\Delta_{\Phi({\rm late})}^2(k) = A_{\rm late}^2 \left[ k (\eta_0 - \eta_*) \right]^{n-1} . \tag{25.7}$$

The relation between the superhorizon late-time matter-dominated potential Φ(late) and the primordial potential Φ(0) is, equations (23.61),

$$\Phi({\rm late}) = \begin{cases} \frac{9}{10} \Phi(0) & \mbox{adiabatic} , \\ \frac{8}{5} \Phi'(0) & \mbox{isocurvature} . \end{cases} \tag{25.8}$$

25.3 CMB power spectrum

The power spectrum of temperature fluctuations in the CMB is defined by

$$C_\ell(\eta_0) \equiv 4\pi \int_0^\infty \left| \Theta_\ell(\eta_0, k) \right|^2 \frac{d^3k}{(2\pi)^3} . \tag{25.9}$$

As seen in Chapter 23, during linear evolution, scalar modes of given comoving wavevector k evolve with amplitude proportional to the initial value Φ(0, k) of the scalar potential (or of its derivative Φ′(0, k), for isocurvature initial conditions). The evolution of the amplitude may be encapsulated in a transfer function Tℓ(η, k) defined by

$$T_\ell(\eta, k) \equiv \frac{\Theta_\ell(\eta, \boldsymbol{k})}{\Phi({\rm late}, \boldsymbol{k})} , \tag{25.10}$$

where Φ(late) is the superhorizon late-time matter-dominated potential. By isotropy, the transfer function Tℓ(η, k) is a function only of the magnitude k of the wavevector k. The power spectrum of the CMB observed on the sky today is related to the primordial power spectrum $P_{\Phi({\rm late})}(k)$ or $\Delta_{\Phi({\rm late})}^2(k)$ by

$$C_\ell(\eta_0) = 4\pi \int_0^\infty \left| T_\ell(\eta_0, k) \right|^2 P_{\Phi({\rm late})}(k) \, \frac{4\pi k^2 dk}{(2\pi)^3} = 4\pi \int_0^\infty \left| T_\ell(\eta_0, k) \right|^2 \Delta_{\Phi({\rm late})}^2(k) \, \frac{dk}{k} . \tag{25.11}$$


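25.4 Matter power spectrum

The matter power spectrum Pm(k) is defined by

$$\langle \delta_m(\boldsymbol{k}') \delta_m(\boldsymbol{k}) \rangle \equiv (2\pi)^3 \delta_D(\boldsymbol{k}' + \boldsymbol{k}) \, P_m(k) . \tag{25.12}$$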

At times well after recombination, the matter power spectrum Pm(η, k) at conformal time η is related to the potential power spectrum (25.2) by, equation (23.117), in units a0 = 1,

$$P_m(\eta, k) = \left( \frac{2a}{3 \Omega_m H_0^2} \right)^2 k^4 P_\Phi(\eta, k) = \left( \frac{2a}{3 \Omega_m H_0^2} \right)^2 \frac{(2\pi)^3}{4\pi} \, k \, \Delta_\Phi^2(k, \eta) . \tag{25.13}$$

At superhorizon scales, the potential Φ(η, k) at conformal time η is related to the late-time matter-dominated potential by

$$\Phi(\eta) = g(a) \, \Phi({\rm late}) \quad \mbox{for } k\eta \ll 1 , \tag{25.14}$$

where g(a) is the growth factor defined by equation (23.116). For a power-law primordial spectrum (25.5), the matter power spectrum at the largest scales goes as

$$P_m(\eta, k) \propto k^n , \tag{25.15}$$

which explains the origin of the scalar index n.
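The counting of powers of k behind this scaling can be made explicit. At the largest scales the growth factor g(a) is independent of k, so the k-dependence comes entirely from the definition of the dimensionless power spectrum and from the factor k⁴ relating matter and potential power:

```latex
P_\Phi(k) = \frac{(2\pi)^3}{4\pi k^3} \, \Delta_\Phi^2(k) \propto k^{-3} \, k^{n-1} = k^{n-4} ,
\qquad
P_m(\eta, k) \propto k^4 P_\Phi(\eta, k) \propto k^4 \, k^{n-4} = k^n .
```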

25.5 Radiative transfer of CMB photons

To determine the harmonics Θℓ(η0, k) of the CMB photon distribution at the present time, return to the Boltzmann equation (24.90) for the photon distribution Θ(η, k, µγ), where $\mu_\gamma \equiv \hat{\boldsymbol{k}} \cdot \hat{\boldsymbol{p}}$:

$$\dot{\Theta} - i k \mu_\gamma \Theta + \bar{n}_e \sigma_T a \, \Theta = \dot{\Phi} + i k \mu_\gamma \Psi + \bar{n}_e \sigma_T a \left[ - i \mu_\gamma v_b + \Theta_0 - \tfrac{1}{2} \Theta_2 P_2(\mu_\gamma) \right] . \tag{25.16}$$

This equation is also called the radiation transfer equation. The terms on the right are the source terms. Define the electron-photon (Thomson) scattering optical depth τ by

$$\frac{d\tau}{d\eta} \equiv - \bar{n}_e \sigma_T a , \tag{25.17}$$

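starting from zero, τ0 = 0, at zero redshift, and increasing going backwards in time η to higher redshift. The photon Boltzmann equation (25.16) can be written

$$e^{i k \mu_\gamma \eta + \tau} \frac{d}{d\eta} \left( e^{- i k \mu_\gamma \eta - \tau} \, \Theta \right) = S_0 - i \mu_\gamma S_1 + (- i \mu_\gamma)^2 S_2 , \tag{25.18}$$

where $S_i$ are source terms

$$\begin{aligned}
S_0 &\equiv \dot{\Phi} - \dot{\tau} \left( \Theta_0 + \tfrac{1}{4} \Theta_2 \right) , && \mbox{(25.19a)} \\
S_1 &\equiv - k \Psi - \dot{\tau} v_b = - k \Psi - 3 \dot{\tau} \Theta_1 , && \mbox{(25.19b)} \\
S_2 &\equiv - \dot{\tau} \, \tfrac{3}{4} \Theta_2 , && \mbox{(25.19c)}
\end{aligned}$$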


where in S1 the tight-coupling approximation vb = 3Θ1 has been used to replace the baryon bulk velocity vb with the photon dipole Θ1. Thus a solution for the photon distribution Θ(η0, k, µγ) today is, at least formally, an integral over the line of sight from the Big Bang to the present time,

$$\Theta(\eta_0, \boldsymbol{k}, \mu_\gamma) = \int_0^{\eta_0} \left[ S_0 - i \mu_\gamma S_1 + (- i \mu_\gamma)^2 S_2 \right] e^{- i k \mu_\gamma (\eta - \eta_0) - \tau} \, d\eta . \tag{25.20}$$

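The −iµγ dependence of the source terms inside the integral can be accommodated through

$$(- i \mu_\gamma)^n \, e^{- i k \mu_\gamma (\eta - \eta_0)} = \left( \frac{1}{k} \frac{\partial}{\partial \eta} \right)^n e^{- i k \mu_\gamma (\eta - \eta_0)} , \tag{25.21}$$

which brings the formal solution (25.20) for Θ(η0, k, µγ) to

$$\Theta(\eta_0, \boldsymbol{k}, \mu_\gamma) = \int_0^{\eta_0} e^{-\tau} \left( S_0 + S_1 \frac{1}{k} \frac{\partial}{\partial \eta} + S_2 \frac{1}{k^2} \frac{\partial^2}{\partial \eta^2} \right) e^{- i k \mu_\gamma (\eta - \eta_0)} \, d\eta . \tag{25.22}$$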

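The dipole source term S1 contains a −Ψ term which it is helpful to integrate by parts:

$$\begin{aligned}
\int_0^{\eta_0} - e^{-\tau} \Psi \, \frac{\partial}{\partial \eta} e^{- i k \mu_\gamma (\eta - \eta_0)} \, d\eta
&= \left[ - e^{-\tau} \Psi \, e^{- i k \mu_\gamma (\eta - \eta_0)} \right]_0^{\eta_0} + \int_0^{\eta_0} \frac{\partial \left( e^{-\tau} \Psi \right)}{\partial \eta} \, e^{- i k \mu_\gamma (\eta - \eta_0)} \, d\eta \\
&= - \Psi(\eta_0) + \int_0^{\eta_0} e^{-\tau} \left( \dot{\Psi} - \dot{\tau} \Psi \right) e^{- i k \mu_\gamma (\eta - \eta_0)} \, d\eta .
\end{aligned} \tag{25.23}$$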

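With the source terms written out explicitly, the present-day photon distribution Θ(η0, k, µγ), equation (25.20), is

$$\Theta(\eta_0, \boldsymbol{k}, \mu_\gamma) + \Psi(\eta_0, \boldsymbol{k}) = \int_0^{\eta_0} \left\{ e^{-\tau} \left( \dot{\Psi} + \dot{\Phi} \right) - \dot{\tau} e^{-\tau} \left[ \Theta_0 + \Psi + \frac{1}{4} \Theta_2 + 3 \Theta_1 \frac{1}{k} \frac{\partial}{\partial \eta} + \frac{3}{4} \Theta_2 \frac{1}{k^2} \frac{\partial^2}{\partial \eta^2} \right] \right\} e^{- i k \mu_\gamma (\eta - \eta_0)} \, d\eta . \tag{25.24}$$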

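The spherical harmonics Θℓ(η, k) of the photon distribution have been defined by equation (24.48). The $e^{- i k \mu_\gamma (\eta - \eta_0)}$ factor in the integral in equation (25.24) can be expanded in spherical harmonics through the general formula

$$e^{i \boldsymbol{k} \cdot \boldsymbol{x}} = \sum_{\ell=0}^{\infty} i^\ell (2\ell+1) \, j_\ell(kx) \, P_\ell(\hat{\boldsymbol{k}} \cdot \hat{\boldsymbol{x}}) , \tag{25.25}$$

where $j_\ell(z) \equiv \sqrt{\pi/(2z)} \, J_{\ell+1/2}(z)$ are spherical Bessel functions. Resolved into harmonics, equation (25.24) becomes

$$\Theta_\ell(\eta_0, \boldsymbol{k}) + \delta_{\ell 0} \Psi(\eta_0, \boldsymbol{k}) = \int_0^{\eta_0} \left\{ e^{-\tau} \left( \dot{\Psi} + \dot{\Phi} \right) - \dot{\tau} e^{-\tau} \left[ \Theta_0 + \Psi + \frac{1}{4} \Theta_2 + 3 \Theta_1 \frac{1}{k} \frac{\partial}{\partial \eta} + \frac{3}{4} \Theta_2 \frac{1}{k^2} \frac{\partial^2}{\partial \eta^2} \right] \right\} j_\ell[k(\eta_0 - \eta)] \, d\eta . \tag{25.26}$$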

Introduce a visibility function g(η) defined by

$$g(\eta) \equiv - \dot{\tau} e^{-\tau} , \tag{25.27}$$


whose integral is one,

$$\int_0^{\eta_0} g(\eta) \, d\eta = \int_\infty^0 - e^{-\tau} \, d\tau = \left[ e^{-\tau} \right]_\infty^0 = 1 . \tag{25.28}$$

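The visibility function is fairly narrowly peaked around recombination at η = η∗. In the approximation of instantaneous recombination, in which the visibility function is treated as a delta-function at η∗,

$$\begin{aligned}
\Theta_\ell(\eta_0, \boldsymbol{k}) + \delta_{\ell 0} \Psi(\eta_0, \boldsymbol{k})
={}& \int_0^{\eta_0} e^{-\tau} \left[ \dot{\Psi}(\eta, \boldsymbol{k}) + \dot{\Phi}(\eta, \boldsymbol{k}) \right] j_\ell[k(\eta_0 - \eta)] \, d\eta && \mbox{ISW} \\
& + \left[ \Theta_0(\eta_*, \boldsymbol{k}) + \Psi(\eta_*, \boldsymbol{k}) \right] j_\ell[k(\eta_0 - \eta_*)] && \mbox{monopole} \\
& - 3 \, \Theta_1(\eta_*, \boldsymbol{k}) \, j_\ell'[k(\eta_0 - \eta_*)] && \mbox{dipole} \\
& + \Theta_2(\eta_*, \boldsymbol{k}) \left\{ \tfrac{1}{4} j_\ell[k(\eta_0 - \eta_*)] + \tfrac{3}{4} j_\ell''[k(\eta_0 - \eta_*)] \right\} && \mbox{quadrupole} .
\end{aligned} \tag{25.29}$$

The sign of the dipole term follows from equation (25.26), since (1/k) ∂/∂η acting on jℓ[k(η0 − η)] yields −jℓ′[k(η0 − η)], whereas the second derivative yields +jℓ′′.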

Here prime ′ on jℓ denotes a total derivative, $j_\ell'(z) = dj_\ell(z)/dz$. The first and second derivatives of the spherical Bessel functions are

$$j_\ell'(z) = \frac{\ell}{z} \, j_\ell(z) - j_{\ell+1}(z) , \qquad j_\ell''(z) = \frac{\ell(\ell-1) - z^2}{z^2} \, j_\ell(z) + \frac{2}{z} \, j_{\ell+1}(z) . \tag{25.30}$$
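The two derivative formulae can be checked against finite differences. The following minimal sketch builds jℓ(z) by upward recurrence from the closed forms of j0 and j1, which is adequate only when ℓ is not much larger than z (upward recurrence loses accuracy for ℓ ≫ z); the function names are illustrative.

```python
import math

def sph_bessel(lmax, z):
    """[j_0(z), ..., j_lmax(z)] by upward recurrence; use only for lmax <~ z."""
    j = [math.sin(z) / z, math.sin(z) / z**2 - math.cos(z) / z]
    for l in range(1, lmax):
        # j_{l+1} = (2l+1)/z j_l - j_{l-1}
        j.append((2 * l + 1) / z * j[l] - j[l - 1])
    return j[: lmax + 1]

def dj(l, z):
    """First derivative: j_l' = (l/z) j_l - j_{l+1}."""
    j = sph_bessel(l + 1, z)
    return l / z * j[l] - j[l + 1]

def d2j(l, z):
    """Second derivative: j_l'' = [l(l-1) - z^2]/z^2 j_l + (2/z) j_{l+1}."""
    j = sph_bessel(l + 1, z)
    return (l * (l - 1) - z * z) / (z * z) * j[l] + 2.0 / z * j[l + 1]

if __name__ == "__main__":
    l, z, h = 2, 5.0, 1e-4
    numeric = (sph_bessel(l, z + h)[l] - sph_bessel(l, z - h)[l]) / (2 * h)
    print(dj(l, z), numeric)
```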

Notice that the monopole term (on both sides of equation (25.29)) is not Θ0 but rather Θ0 + Ψ, which is the temperature fluctuation redshifted by the potential Ψ. On the left hand side, the δℓ0 term arises from the redshift at our position today, but this monopole perturbation just adds to the mean unperturbed temperature, and is not observable.

25.6 Integrals over spherical Bessel functions

Computing the photon harmonics Θℓ by integration of equation (25.26), or the CMB power spectrum Cℓ in the instantaneous recombination approximation by integration of equation (25.11) with (25.29), involves evaluating integrals of the form

$$\int_0^\infty f(z) \, g(qz) \, \frac{dz}{z} , \tag{25.31}$$

with

$$g(z) \equiv j_\ell(z) \, z^{n-1} \quad \mbox{or} \quad g(z) \equiv j_\ell(z) \, j_{\ell'}(z) \, z^{n-1} . \tag{25.32}$$

Such integrals present a challenge because of the oscillatory character of the functions g(z). This section presents a method to evaluate such integrals reliably. Integrals with different ℓ are related by recursion relations (25.38) that permit rapid evaluation over many ℓ. Details are left to Exercise 25.2. The approach is to recast the integral (25.31) into Fourier space with respect to ln z, and to apply a Fast Fourier Transform of f (z) over a logarithmic interval. This involves replacing the true f (z) with a function that is periodic in ln z over a logarithmic interval [−L/2, L/2] of width L centred on ln z = 0. The


approximation works because the functions g(z) given by equation (25.32) tend to zero at z → 0 and z → ∞, so spurious periodic duplications at small and large z contribute negligibly to the integral (25.31) provided that the logarithmic interval L is chosen sufficiently broad. The logarithmic interval L can be broadened to whatever extent is necessary by extrapolating the function f(z) to smaller and larger z.

If f(z) is periodic in ln z over a logarithmic interval L, then f(z) is a sum of discrete Fourier modes $e^{2\pi i m \ln(z)/L}$ in which m is integral. If f(z) is a smooth function, then f(z) may be adequately approximated by a finite number N of discrete modes m = −[(N − 1)/2], ..., [N/2], where [N/2] denotes the largest integer less than or equal to N/2. Under these circumstances, the function f(z) is given by a discrete Fourier expansion whose Fourier components $f_m$ are related to the values $f(z_n)$ at N logarithmically spaced points $z_n = e^{nL/N}$,

$$f(z) = \sum_{m=-[(N-1)/2]}^{[N/2]} f_m \, e^{2\pi i m \ln(z)/L} , \qquad f_m = \frac{1}{N} \sum_{n=-[(N-1)/2]}^{[N/2]} f(z_n) \, e^{- 2\pi i m n/N} . \tag{25.33}$$

Since f(z) is real, the negative frequency modes are the complex conjugates of the positive frequency modes,

$$f_{-m} = f_m^* , \tag{25.34}$$

and for even N (the usual choice) the outermost (Nyquist) frequency mode $f_{N/2}$ is real. For a function f(z) given by the Fourier sum (25.33), the integral (25.31) is

$$\int_0^\infty f(z) \, g(qz) \, \frac{dz}{z} = \sum_{m=-[(N-1)/2]}^{[N/2]} f_m \, q^{- 2\pi i m/L} \int_0^\infty g(z) \, z^{2\pi i m/L} \, \frac{dz}{z} . \tag{25.35}$$

For functions g(z) given by equation (25.32), the integrals over g(z) on the right hand side of equation (25.35) can be done analytically:

$$\int_0^\infty f(z) \, g(qz) \, \frac{dz}{z} = \sum_{m=-[(N-1)/2]}^{[N/2]} f_m \, q^{- 2\pi i m/L} \; U\!\left( n - 1 + \frac{2\pi i m}{L} \right) , \tag{25.36}$$

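where for $g(z) = j_\ell(z) z^{n-1}$ or $g(z) = j_\ell(z) j_{\ell'}(z) z^{n-1}$ the function U(x) is respectively

$$U_\ell(x) \equiv \int_0^\infty j_\ell(z) \, z^x \, \frac{dz}{z} = \frac{2^{x-2} \sqrt{\pi} \; \Gamma\!\left[ \tfrac{1}{2} (\ell + x) \right]}{\Gamma\!\left[ \tfrac{1}{2} (\ell - x + 3) \right]} , \tag{25.37a}$$

$$U_{\ell\ell'}(x) \equiv \int_0^\infty j_\ell(z) \, j_{\ell'}(z) \, z^x \, \frac{dz}{z} = \frac{2^{x-3} \pi \, \Gamma(2 - x) \, \Gamma\!\left[ \tfrac{1}{2} (\ell + \ell' + x) \right]}{\Gamma\!\left[ \tfrac{1}{2} (\ell + \ell' - x + 4) \right] \Gamma\!\left[ \tfrac{1}{2} (\ell - \ell' - x + 3) \right] \Gamma\!\left[ \tfrac{1}{2} (\ell' - \ell - x + 3) \right]} . \tag{25.37b}$$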


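Recurrence relations such as

$$\begin{aligned}
U_\ell(x) &= (\ell + x - 2) \, U_{\ell-1}(x - 1) && \mbox{(25.38a)} \\
&= \frac{\ell + x - 2}{\ell - x + 1} \, U_{\ell-2}(x) , && \mbox{(25.38b)} \\
U_{\ell\ell'}(x) &= \frac{\ell + \ell' + x - 2}{\ell + \ell' - x + 2} \, U_{\ell-1,\ell'-1}(x) && \mbox{(25.38c)} \\
&= \frac{(\ell + \ell' + x - 2)(\ell - \ell' + x - 3)}{2 (x - 2)} \, U_{\ell-1,\ell'}(x - 1) && \mbox{(25.38d)} \\
&= \frac{(\ell + \ell' + x - 2)(\ell - \ell' + x - 3)}{(\ell + \ell' - x + 2)(\ell' - \ell + x - 1)} \, U_{\ell-2,\ell'}(x) , && \mbox{(25.38e)}
\end{aligned}$$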

permit rapid evaluation of the functions Uℓ (x) or Uℓℓ′ (x) as a function of ℓ and ℓ′ , starting from small ℓ, ℓ′ . As written, the right hand side of equation (25.36) has a small imaginary part arising from the contribution of the outermost (Nyquist) mode, m = [N/2], but this imaginary part should be dropped since it cancels when averaged with the contribution of its negative frequency partner m = −[N/2].
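The method of this section can be sketched in a few lines for the single-Bessel case $g(z) = j_\ell(z) z^{n-1}$. The sketch below is an illustration under stated assumptions, not a production implementation: it uses a direct O(N²) discrete Fourier sum in place of a Fast Fourier Transform, it evaluates $U_\ell(x)$ directly from Gamma functions rather than by recurrence, and the helper `cgamma` (a hypothetical name) computes the complex Gamma function from Stirling's series after shifting the argument to real part ≥ 10. The defaults L = 24 and N = 256 are arbitrary choices.

```python
import cmath
import math

def cgamma(z):
    """Complex Gamma function: shift to Re z >= 10, then apply Stirling's series."""
    z = complex(z)
    prod = 1.0 + 0.0j
    while z.real < 10.0:
        prod *= z            # Gamma(z) = Gamma(z + 1) / z
        z += 1.0
    series = 1 / (12 * z) - 1 / (360 * z**3) + 1 / (1260 * z**5) - 1 / (1680 * z**7)
    log_gamma = (z - 0.5) * cmath.log(z) - z + 0.5 * math.log(2 * math.pi) + series
    return cmath.exp(log_gamma) / prod

def U(ell, x):
    """U_l(x) = 2^(x-2) sqrt(pi) Gamma[(l+x)/2] / Gamma[(l-x+3)/2] for complex x."""
    return 2 ** (x - 2) * math.sqrt(math.pi) * cgamma((ell + x) / 2) / cgamma((ell - x + 3) / 2)

def bessel_integral(f, q, ell, n, L=24.0, N=256):
    """Estimate the integral of f(z) j_l(qz) (qz)^(n-1) dz/z over 0 < z < infinity."""
    idx = range(-((N - 1) // 2), N // 2 + 1)            # centred index range
    samples = [f(math.exp(j * L / N)) for j in idx]     # f at z_j = exp(j L / N)
    total = 0j
    for m in idx:
        # Fourier component f_m of f with respect to ln z
        fm = sum(s * cmath.exp(-2j * math.pi * m * j / N) for s, j in zip(samples, idx)) / N
        y = 2 * math.pi * m / L
        total += fm * q ** (-1j * y) * U(ell, n - 1 + 1j * y)
    return total.real        # drop the small imaginary part from the Nyquist mode

if __name__ == "__main__":
    # Test case with a closed form: f(z) = z^(5/2) exp(-z^2/2), l = 0, n = 3/2,
    # for which the integral equals sqrt(q) sqrt(pi/2) exp(-q^2/2).
    f = lambda z: z**2.5 * math.exp(-0.5 * z * z)
    for q in (0.5, 1.0):
        exact = math.sqrt(q) * math.sqrt(math.pi / 2) * math.exp(-0.5 * q * q)
        print(q, bessel_integral(f, q, 0, 1.5), exact)
```

In practice one would replace the inner sum by an FFT and fill in $U_\ell$ for successive ℓ with the recurrence relations, evaluating Gamma functions only for the starting values.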

25.7 Large-scale CMB fluctuations

The behaviour of the CMB power spectrum at the largest angular scales was first predicted by R. K. Sachs & A. M. Wolfe (1967, Astrophys. J., 147, 73), and is therefore called the “Sachs-Wolfe effect,” though why it should be called an effect is mysterious. The Sachs-Wolfe (SW) effect is distinct from, but modulated by, the Integrated Sachs-Wolfe (ISW) effect. The ISW effect, ignored in this section, is considered in §25.9.

At scales much larger than the sound horizon at recombination, $k \eta_{s,*} \ll 1$, the redshifted monopole fluctuation Θ0(η∗, k) + Ψ(η∗, k) at recombination is much larger than the dipole Θ1(η∗, k) or quadrupole Θ2(η∗, k), so only the monopole contributes materially to the temperature multipoles Θℓ(η0, k) today. The redshifted monopole contribution to the temperature multipoles Θℓ(η0, k) today is, from equation (25.29),

$$\Theta_\ell(\eta_0, \boldsymbol{k}) = \left[ \Theta_0(\eta_*, \boldsymbol{k}) + \Psi(\eta_*, \boldsymbol{k}) \right] j_\ell[k(\eta_0 - \eta_*)] . \tag{25.39}$$

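At the very largest scales, $k \eta_{\rm eq} \ll 1$, the solution for the redshifted radiation monopole Θ0 + Ψ at the time η∗ of recombination is, from equation (23.62),

$$\Theta_0(\eta_*, \boldsymbol{k}) + \Psi(\eta_*, \boldsymbol{k}) = 2 \Phi_{\rm super}(\eta_*, \boldsymbol{k}) - \frac{3}{2} \Phi(0) \equiv \frac{A_{\rm SW}(\eta_*)}{A_{\rm late}} \, \Phi_{\rm super}({\rm late}, \boldsymbol{k}) , \tag{25.40}$$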

where the last expression defines the Sachs-Wolfe amplitude ASW (η∗ ) at recombination. In the approximation that recombination happens well into the matter-dominated regime, so that Φsuper (η∗ , k) ≈ Φsuper (late, k),


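from equations (23.61),

$$\frac{A_{\rm SW}(\eta_*)}{A_{\rm late}} \approx \frac{A_{\rm SW}({\rm late})}{A_{\rm late}} = 2 - \frac{3 \Phi(0)}{2 \Phi({\rm late})} = \begin{cases} \frac{1}{3} & \mbox{adiabatic} , \\ 2 & \mbox{isocurvature} . \end{cases} \tag{25.41}$$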

In reality, recombination occurs only somewhat into the matter-dominated regime, and the solutions for the potential Φsuper(η∗, k) from §23.9 should be used in place of the approximation (25.41). Putting equations (25.39) and (25.40) together shows that the transfer function Tℓ(η0, k), equation (25.10), that goes into the present-day CMB angular power spectrum Cℓ(η0), equation (25.11), is

$$T_\ell(\eta_0, k) = \frac{A_{\rm SW}(\eta_*)}{A_{\rm late}} \, j_\ell[k(\eta_0 - \eta_*)] . \tag{25.42}$$

If the primordial power spectrum is a power law with tilt n, equation (25.7), then the resulting CMB angular power spectrum is, with z = k(η0 − η∗),

$$C_\ell(\eta_0) = 4\pi A_{\rm SW}(\eta_*)^2 \int_0^\infty j_\ell(z)^2 \, z^{n-1} \, \frac{dz}{z} = 4\pi A_{\rm SW}(\eta_*)^2 \, U_{\ell,\ell}(n - 1) , \tag{25.43}$$

where $U_{\ell,\ell}(x)$ is given by equation (25.37b). For the particular case of a scale-invariant primordial power spectrum, n = 1, the CMB power spectrum Cℓ at large scales today is given by

$$\ell(\ell+1) \, C_\ell(\eta_0) = 2\pi A_{\rm SW}(\eta_*)^2 \quad \mbox{if } n = 1 . \tag{25.44}$$

Thus the characteristic feature of a scale-invariant primordial power spectrum, n = 1, is that ℓ(ℓ + 1)Cℓ should be approximately constant at the largest angular scales, ℓ ≪ η0 /η∗ . This is a primary reason why CMB folk routinely plot ℓ(ℓ + 1)Cℓ , rather than Cℓ .
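The Sachs-Wolfe plateau is simple to reproduce numerically. The sketch below evaluates the Gamma-function expression for $U_{\ell\ell}(x)$ with ℓ′ = ℓ at real x = n − 1, using log-Gamma to avoid overflow at large ℓ; the function names and the default amplitude A_SW = 1 are illustrative assumptions.

```python
import math

def U_ll(ell, x):
    """U_{ll}(x) for l' = l and real x, valid for x < 2 and 2l + x > 0."""
    log_ratio = math.lgamma(ell + 0.5 * x) - math.lgamma(ell + 2.0 - 0.5 * x)
    return (2.0 ** (x - 3.0) * math.pi * math.gamma(2.0 - x)
            * math.exp(log_ratio) / math.gamma(0.5 * (3.0 - x)) ** 2)

def sachs_wolfe_cl(ell, n, A_sw=1.0):
    """C_l = 4 pi A_SW^2 U_{ll}(n - 1) for a power-law spectrum of tilt n."""
    return 4.0 * math.pi * A_sw**2 * U_ll(ell, n - 1.0)

if __name__ == "__main__":
    for n in (1.0, 0.96):
        row = [ell * (ell + 1) * sachs_wolfe_cl(ell, n) / (2 * math.pi)
               for ell in (2, 10, 100)]
        print(n, row)
```

For n = 1 the combination ℓ(ℓ+1)Cℓ/(2π A_SW²) equals one at every ℓ, while a tilt slightly below one makes it decline slowly with ℓ, roughly as ℓ^{n−1}.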

25.8 Monopole, dipole, and quadrupole contributions to Cℓ

At smaller scales, kη∗ ≳ 1, not only the photon monopole Θ0(η∗, k), but also the dipole Θ1(η∗, k), and to a small extent the quadrupole Θ2(η∗, k), contribute to the temperature multipoles Θℓ(η0, k) today, equation (25.29). The dipole is related to the monopole by the evolution equation (??) for the monopole,

$$k \Theta_1 = \dot{\Theta}_0 - \dot{\Phi} . \tag{25.45}$$

25.9 Integrated Sachs-Wolfe (ISW) effect

Concept question 25.1 Cosmic Neutrino Background. Just as photons decoupled at recombination, so also neutrinos decoupled at electron-positron annihilation. Compare qualitatively the expected fluctuations in the CνB to those in the CMB.


Exercise 25.2 Numerical integration of sequences of integrals over Bessel functions. Write code that solves integrals (25.31) numerically for g(z) given by equation (25.32), using a Fast Fourier Transform, equation (25.36), and recurrence relations appropriate for the monopole, dipole, and quadrupole contributions to equations (25.26) or (25.29). To compute $U_\ell(x)$ or $U_{\ell\ell'}(x)$ for the initial ℓ, ℓ′, you will need to find code that implements the complex Gamma function. Note that most FFT codes store input and output periodic sequences shifted by [N/2] compared to the convention (25.33). That is, an FFT code typically takes an input sequence ordered as $\{ f(z_0), f(z_1), ..., f(z_{[N/2]}), ..., f(z_{-2}), f(z_{-1}) \}$, with periodic identification $f(z_n) = f(z_{n+N})$, and evaluates Fourier coefficients $f_m$ as

$$f_m = \frac{1}{N} \sum_{n=0}^{N-1} f(z_n) \, e^{- 2\pi i m n/N} , \tag{25.46}$$

which yields the same Fourier components $f_m$ as (25.33) but in the order $\{ f_0, f_1, ..., f_{[N/2]}, ..., f_{-2}, f_{-1} \}$, with periodic identification $f_m = f_{m+N}$.
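The difference between the two orderings is purely a cyclic shift, as the following sketch illustrates with direct (slow) Fourier sums standing in for an FFT library. The function names are illustrative; any real FFT routine that follows the usual zero-first ordering could replace `dft_fft_order`, bearing in mind that some libraries omit the 1/N normalization on the forward transform.

```python
import cmath

def dft_centered(samples_centered):
    """Fourier coefficients with centred indexing.

    samples_centered[i] holds f(z_n) for n = -[(N-1)/2] + i.
    Returns a dict {m: f_m} for m = -[(N-1)/2], ..., [N/2].
    """
    N = len(samples_centered)
    lo = -((N - 1) // 2)
    idx = range(lo, lo + N)
    return {m: sum(s * cmath.exp(-2j * cmath.pi * m * n / N)
                   for n, s in zip(idx, samples_centered)) / N
            for m in idx}

def to_fft_order(centered):
    """Cyclically shift a centred sequence into the order 0, 1, ..., [N/2], ..., -2, -1."""
    lo = (len(centered) - 1) // 2      # number of negative indices
    return centered[lo:] + centered[:lo]

def dft_fft_order(samples):
    """Plain discrete Fourier transform with input and output in FFT order."""
    N = len(samples)
    return [sum(s * cmath.exp(-2j * cmath.pi * m * n / N)
                for n, s in enumerate(samples)) / N
            for m in range(N)]

if __name__ == "__main__":
    centered = [1.0, 2.0, 0.5, -1.0, 3.0, 0.25, -2.0, 4.0]   # n = -3, ..., 4
    a = dft_centered(centered)
    b = dft_fft_order(to_fft_order(centered))
    for m in sorted(a):
        print(m, a[m], b[m % len(centered)])   # f_m sits at position m mod N
```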

Various recurrence relations. For $g(z) = j_\ell(z) z^{n-1}$, the dipole and quadrupole integrals $U_\ell^{(1)}(x)$ and $U_\ell^{(2)}(x)$ are related to the monopole integral $U_\ell(x)$ (25.37a) by

$$U_\ell^{(1)}(x) \equiv \int_0^\infty j_\ell'(z) \, z^x \, \frac{dz}{z} = (1 - x) \, U_\ell(x - 1) , \tag{25.47a}$$

$$U_\ell^{(2)}(x) \equiv \int_0^\infty \left[ \tfrac{1}{4} j_\ell(z) + \tfrac{3}{4} j_\ell''(z) \right] z^x \, \frac{dz}{z} = \frac{\ell(\ell+1) + 2x(x-2)}{4 (\ell + x - 2)(\ell - x + 3)} \, U_\ell(x) . \tag{25.47b}$$

The dipole and quadrupole integrals satisfy the recurrence relations

$$U_\ell^{(1)}(x) = \frac{\ell + x - 3}{\ell - x + 2} \, U_{\ell-2}^{(1)}(x) , \tag{25.48a}$$

$$U_\ell^{(2)}(x) = \frac{(\ell + x - 4) \left[ \ell(\ell+1) + 2x(x-2) \right]}{(\ell - x + 3) \left[ \ell(\ell-3) + 2(x-1)^2 \right]} \, U_{\ell-2}^{(2)}(x) . \tag{25.48b}$$
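Relations of this kind are easy to mistype, so it is worth checking them numerically before relying on them. The sketch below does so for the monopole and dipole integrals at real x, using `math.gamma`; a real calculation needs complex x and hence a complex Gamma function. The function names are illustrative.

```python
import math

def U(ell, x):
    """Monopole integral U_l(x) from its Gamma-function form, real arguments only."""
    return (2.0 ** (x - 2.0) * math.sqrt(math.pi)
            * math.gamma(0.5 * (ell + x)) / math.gamma(0.5 * (ell - x + 3.0)))

def U1(ell, x):
    """Dipole integral: U^(1)_l(x) = (1 - x) U_l(x - 1)."""
    return (1.0 - x) * U(ell, x - 1.0)

def U_up(ell_max, x, parity):
    """U_l(x) for l = parity, parity + 2, ... <= ell_max, stepping l by two."""
    values = {parity: U(parity, x)}
    for ell in range(parity + 2, ell_max + 1, 2):
        # U_l(x) = (l + x - 2)/(l - x + 1) U_{l-2}(x)
        values[ell] = (ell + x - 2.0) / (ell - x + 1.0) * values[ell - 2]
    return values

if __name__ == "__main__":
    x = 0.7
    for ell, value in U_up(10, x, 0).items():
        print(ell, value, U(ell, x))                 # recurrence vs direct evaluation
    print(U1(6, x) / U1(4, x), (6 + x - 3) / (6 - x + 2))
```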

For g(z) = jℓ (z)jℓ′ (z)z n−1 , the relations get a bit ugly — it’s a good idea to use Mathematica or a similar program to generate the relations automatically. The various multipole integrals of interest are related to


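the integral $U_{\ell\ell}(x)$ (25.37b) by (here $j_\ell'$ denotes the derivative of $j_\ell$)

$$U_{\ell\ell}^{(0,1)}(x) \equiv \int_0^\infty j_\ell(z) \, j_\ell'(z) \, z^x \, \frac{dz}{z} = \frac{1 - x}{2} \, U_{\ell\ell}(x - 1) , \tag{25.49a}$$

$$U_{\ell\ell}^{(1,1)}(x) \equiv \int_0^\infty \left[ j_\ell'(z) \right]^2 z^x \, \frac{dz}{z} = \frac{4\ell(\ell+1) - x(x-2)(x-3)}{(3 - x)(2\ell + x - 2)(2\ell - x + 4)} \, U_{\ell\ell}(x) , \tag{25.49b}$$

$$U_{\ell\ell}^{(0,2)}(x) \equiv \int_0^\infty j_\ell(z) \left[ \tfrac{1}{4} j_\ell(z) + \tfrac{3}{4} j_\ell''(z) \right] z^x \, \frac{dz}{z} = \frac{x \left[ 2\ell(\ell+1) + (x-1)(x-2) \right]}{2 (x - 3)(2\ell + x - 2)(2\ell - x + 4)} \, U_{\ell\ell}(x) , \tag{25.49c}$$

$$U_{\ell\ell}^{(1,2)}(x) \equiv \int_0^\infty j_\ell'(z) \left[ \tfrac{1}{4} j_\ell(z) + \tfrac{3}{4} j_\ell''(z) \right] z^x \, \frac{dz}{z} = \frac{(1 - x) \left[ 2\ell(\ell+1)(x-7) + (x+1)(x-3)(x-4) \right]}{4 (x - 4)(2\ell + x - 3)(2\ell - x + 5)} \, U_{\ell\ell}(x - 1) , \tag{25.49d}$$

$$U_{\ell\ell}^{(2,2)}(x) \equiv \int_0^\infty \left[ \tfrac{1}{4} j_\ell(z) + \tfrac{3}{4} j_\ell''(z) \right]^2 z^x \, \frac{dz}{z} = \frac{4(\ell-1)\ell(\ell+1)(\ell+2) \left[ 12 + x(x-2) \right] + x(x-2)(x-5) \left[ 4\ell(\ell+1)(x-6) + (x+2)(x-3)(x-4) \right]}{4 (x - 3)(x - 5)(2\ell + x - 2)(2\ell + x - 4)(2\ell - x + 4)(2\ell - x + 6)} \, U_{\ell\ell}(x) . \tag{25.49e}$$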

The various recurrence relations of interest are

$$U_{\ell\ell}^{(0,1)}(x) = \frac{2\ell + x - 3}{2\ell - x + 3} \, U_{\ell-1,\ell-1}^{(0,1)}(x) , \tag{25.50a}$$

$$U_{\ell\ell}^{(1,1)}(x) = \frac{(2\ell + x - 4) \left[ 4\ell(\ell+1) - x(x-2)(x-3) \right]}{(2\ell - x + 4) \left[ 4\ell(\ell-1) - x(x-2)(x-3) \right]} \, U_{\ell-1,\ell-1}^{(1,1)}(x) , \tag{25.50b}$$

$$U_{\ell\ell}^{(0,2)}(x) = \frac{(2\ell + x - 4) \left[ 2\ell(\ell+1) + (x-1)(x-2) \right]}{(2\ell - x + 4) \left[ 2\ell(\ell-1) + (x-1)(x-2) \right]} \, U_{\ell-1,\ell-1}^{(0,2)}(x) , \tag{25.50c}$$

$$U_{\ell\ell}^{(1,2)}(x) = \frac{(2\ell + x - 5) \left[ 2\ell(\ell+1)(x-7) + (x+1)(x-3)(x-4) \right]}{(2\ell - x + 5) \left[ 2\ell(\ell-1)(x-7) + (x+1)(x-3)(x-4) \right]} \, U_{\ell-1,\ell-1}^{(1,2)}(x) , \tag{25.50d}$$

$$U_{\ell\ell}^{(2,2)}(x) = \frac{2\ell + x - 6}{2\ell - x + 6} \, U_{\ell-1,\ell-1}^{(2,2)}(x) \times \frac{4(\ell-1)\ell(\ell+1)(\ell+2) \left[ 12 + x(x-2) \right] + x(x-2)(x-5) \left[ 4\ell(\ell+1)(x-6) + (x+2)(x-3)(x-4) \right]}{4(\ell-1)\ell(\ell+1)(\ell-2) \left[ 12 + x(x-2) \right] + x(x-2)(x-5) \left[ 4\ell(\ell-1)(x-6) + (x+2)(x-3)(x-4) \right]} . \tag{25.50e}$$
