
The arrow of time

Why do we experience time the way we do?  The future seems a very different beast to the past, but as far as we can tell all of nature's fundamental laws are fully reversible.  Take a ball that's just been kicked into the air.  Newton tells us that the ball will lose speed as it rises to its greatest height, at which point it will start to fall, and that it will trace out a parabola as it does.  However, if we play the video backwards, the ball will do exactly the same thing, and this is because the laws apply equally well when you start with "final" conditions instead of "initial" conditions and rewrite the equations in terms of "backwards time" $\tau = -t$ instead of time $t$.
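
To see this explicitly for the kicked ball, take the vertical equation of motion and substitute $\tau = -t$: the chain rule flips the sign of the first derivative but leaves the second derivative alone, so the law keeps exactly the same form in backwards time:
$$
m\frac{d^2x}{dt^2} = -mg
\quad\Longrightarrow\quad
\frac{dx}{d\tau} = -\frac{dx}{dt},\qquad
\frac{d^2x}{d\tau^2} = \frac{d^2x}{dt^2},
\quad\text{so}\quad
m\frac{d^2x}{d\tau^2} = -mg.
$$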

Despite this, we do not experience the two directions in the same way.  In this post I will give an argument for why this is the case - one that borrows from thermodynamics, Hamiltonian classical mechanics, and Landauer's Limit.

Phase Space and Entropy

The 19th-century Irish mathematician William Rowan Hamilton found an incredibly elegant way to rewrite the classical mechanics of Newton.  He recognised that the degrees of freedom for a system of $N$ particles could be given by 3 components for each particle's position plus 3 components for each particle's momentum.  This means that the entire state of the system can be specified by giving the $6N$ numbers $x_1,x_2,x_3,..., x_{3N-2},x_{3N-1},x_{3N}$, $p_1,p_2,p_3,..., p_{3N-2},p_{3N-1},p_{3N}$$^{\dagger}$.  Hamilton's insight was that this massive vector defines a position in a very high dimensional real vector space and that we can attach an arrow to each location in that space that tells us what happens next if the system passes through that point.  And since the next point also has an arrow attached to it we can draw a line which tells us exactly how the system - how the universe - will evolve in time.

In fact Hamilton went further and gave us some equations.  If $H(\mathbf{x},\mathbf{p})$ is the total (i.e. kinetic plus potential) energy - the bold typeface indicates that the parameters are vectors - then the evolution of the system is given by
$$
\begin{align}
\dot{x_i} &= \frac{\partial{H}}{\partial{p_i}} \\
\dot{p_i} &= -\frac{\partial{H}}{\partial{x_i}} \\
\end{align}
$$
Physicists call this very high dimensional space, in which reality traces out a path, the phase space.
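
To make this concrete, here is a minimal numerical sketch (my own illustration, not part of the original post) that steps Hamilton's equations for a single particle thrown straight up, using the assumed Hamiltonian $H = \frac{p^2}{2m} + mgx$:

```python
import numpy as np

m, g = 1.0, 9.81              # assumed mass (kg) and gravitational acceleration (m/s^2)

def dH_dp(x, p):
    return p / m              # dH/dp for H = p^2/(2m) + m*g*x

def dH_dx(x, p):
    return m * g              # dH/dx

def evolve(x, p, dt=1e-3, steps=2000):
    """Trace the phase-space path (x, p) by stepping Hamilton's equations."""
    path = [(x, p)]
    for _ in range(steps):
        x = x + dt * dH_dp(x, p)     # dx/dt =  dH/dp
        p = p - dt * dH_dx(x, p)     # dp/dt = -dH/dx
        path.append((x, p))
    return np.array(path)

# A ball "kicked" straight up at 10 m/s from x = 0: the height rises then falls,
# while the momentum decreases steadily - a single curve in this 2D phase space.
trajectory = evolve(x=0.0, p=10.0)
print(trajectory[::500])      # a few (position, momentum) samples along the path
```

Plotting the resulting $(x, p)$ pairs would show the one trajectory this tiny system traces through its two-dimensional phase space; for $N$ particles the same picture lives in $6N$ dimensions.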

Entropy $S$ seems at first sight to be totally unrelated to phase space.  Its earliest definition was given in terms of movement of heat:
$$
dS = \frac{\delta Q}{T}
$$
This says that the change in entropy of part of a system is equal to the heat energy $\delta Q$ added to it divided by its temperature.  The context here is that the relevant part of the system should start off in a state of homogeneous temperature and pressure and likewise end up in a homogeneous state.  And it should be noted that this is the entropy change for just part of the system.  Clearly that heat came from somewhere, and that somewhere lost entropy, although perhaps not the same amount since it may be at a different temperature.  The famous 2nd law of thermodynamics, $dS > 0$, states that for an isolated system the total entropy must increase with time.  If the isolated system consists of a hot material and a cold material then heat cannot flow from the cold material to the hot material.  If the isolated system is more complex then it is in fact possible for heat to flow in the opposite direction in some parts of the system, as long as the entropy increase elsewhere more than compensates for the corresponding entropy loss.
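
To spell out that last point with a small worked example: if heat $Q$ leaves a hot body at temperature $T_h$ and enters a cold body at $T_c < T_h$, the total entropy change of the isolated pair is
$$
\Delta S = \frac{Q}{T_c} - \frac{Q}{T_h} = Q\,\frac{T_h - T_c}{T_h T_c} > 0,
$$
so the hot-to-cold direction raises the total entropy, while the reverse flow would lower it and is therefore ruled out.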

Later Boltzmann came up with a very different looking definition for entropy, and this one was absolute rather than differential:
$$
S = k \ln(\Omega)
$$
In this $k = 1.38\times10^{-23}\ \mathrm{J\,K^{-1}}$ is just a constant (Boltzmann's constant), and the $\Omega$ inside the natural logarithm is the "number of different ways one could assign positions and momenta to individual particles so as to give rise to a material with the same temperature and pressure"$^{\dagger_2}$.  This definition seems a bit dodgy at first, because how do you "count" something which is actually continuous?  In fact it doesn't matter much what granularity you use: suppose you initially rounded positions to micrometers, and then changed your mind and rounded to 10ths of a micrometer.  The result would be that all entropies would be $3Nk\ln(10)$ larger with the new definition but - critically - differences in entropy would be unaffected.  The same goes for the granularity you choose when deciding what constitutes having "the same temperature and pressure".  Boltzmann showed that the thermodynamic and statistical definitions of entropy were in fact the same thing, and this led to a re-interpretation of the 2nd law: the universe becomes more disordered over time!
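
To make the granularity argument explicit: refining each of the $3N$ position coordinates by a factor of 10 multiplies the count of microstates by $10^{3N}$, so
$$
S_{\text{new}} = k\ln\!\left(10^{3N}\,\Omega\right) = S_{\text{old}} + 3Nk\ln(10),
$$
a constant offset that cancels whenever you take the difference of two entropies.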
 
Note that Boltzmann's definition links entropy to phase space.  $\Omega$ is just the volume of a region of phase space.  The region corresponds to the bunch of points which all look the same when viewed macroscopically.  We can think of phase space as divided into such regions in much the same way foam bubbles can divide up the space inside a bottle.
The question is: why does entropy, defined in this way, always increase?  There's certainly nothing in the reversible laws of mechanics that demands it!

Penrose's answer

Roger Penrose gave an excellent answer to this question in his book Cycles of Time.  He points out that the universe started in a place of low entropy, so that - in the foam bubble analogy - the initial point in phase space was where all the bubbles were really really small.  If we imagine that the point in phase space corresponding to "now" is to all intents and purposes moving along a random walk, then it is inevitable that it's going to move to larger and larger bubbles, simply because it started off in a very rare place where the bubbles were much smaller than normal.  In fact, the foam bubble analogy doesn't really do it justice because phase space is not 3 dimensional like the interior of a bottle, but gazillion dimensional.  This means it is possible for the volume of the bubble containing the phase point to grow really really fast as it moves away from its initial location in phase space.

However, Penrose's observation only half answers the question.  It tells us why entropy always increases in one direction but it doesn't tell us why that direction feels like the future and why the other direction feels like the past.  That is the question that I am going to attempt to address here.

Liouville's Theorem

Hamilton pointed out that you can use his equations to calculate a trajectory for any point in phase space, and so work out how the corresponding system would evolve in time.  Liouville was thinking along the same lines, but for regions of phase space.  If you start with a blob in phase space and evolve each point in it, you end up with a new blob.  The curious thing that Liouville managed to demonstrate is that the volume of the blob remains the same when you do this$^{\dagger_3}$.  This surprising result tells us that phase space is in some sense a natural way to describe the space of states for a system.
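
Liouville's theorem is easy to check numerically.  The sketch below is my own illustration (the pendulum Hamiltonian $H = p^2/2 - \cos(x)$ and all the parameter values are just assumptions for the demo): it evolves the boundary of a small square blob of phase points with an area-preserving (symplectic) integrator and reports the enclosed area, which stays put even as the blob is stretched and sheared.

```python
import numpy as np

def step(x, p, dt=1e-3):
    """One symplectic-Euler step for a pendulum, H = p**2/2 - cos(x).

    Symplectic Euler is itself an exactly area-preserving map, so it is a
    fair way to test Liouville numerically (plain forward Euler is not).
    """
    p = p - dt * np.sin(x)    # dp/dt = -dH/dx
    x = x + dt * p            # dx/dt =  dH/dp
    return x, p

def polygon_area(x, p):
    """Shoelace formula for the area enclosed by the boundary points."""
    return 0.5 * abs(np.dot(x, np.roll(p, -1)) - np.dot(p, np.roll(x, -1)))

# Boundary of a small square blob of initial conditions near (x, p) = (1, 0).
n, eps = 200, 0.1
s = np.linspace(0.0, 1.0, n, endpoint=False)
xb = 1.0 + np.concatenate([eps * s, np.full(n, eps), eps * (1 - s), np.zeros(n)])
pb = np.concatenate([np.zeros(n), eps * s, np.full(n, eps), eps * (1 - s)])

area0 = polygon_area(xb, pb)
for k in range(1, 5001):
    xb, pb = step(xb, pb)
    if k % 1000 == 0:
        print(f"t = {k * 1e-3:.0f}  blob area / initial area = "
              f"{polygon_area(xb, pb) / area0:.4f}")
```

The printed ratios stay at essentially 1.0000: the blob changes shape as it flows around phase space, but its volume does not.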

Memories and other records

What is it that makes the future feel different from the past?  It is that we can recollect the past but not the future.  That is to say that we have memories only of the past state of the world.  These are structures in our brain we can play back to regenerate images of the past.  Memories are an example of a more general concept: the record.  A record could be a memory, but could also be a photograph, a CD-ROM, some notes written on a scrap of paper, or bytes in a database.  In each case the record contains information about the past which - with the right equipment - can be played back to answer questions about the state the world was in.

My goal is to demonstrate that records can in fact only be laid down if entropy is increasing.  This then completely answers our question of why the future feels different to the past: We live in a part of the block universe in which entropy is changing with time; and any records that do exist must relate to states of the universe in which the entropy was smaller, giving us a very strong sense of "earlier" and of "later".

A Toy Universe

To illustrate, let's imagine a simple universe which has very little in it.  Specifically it has two heat reservoirs connected together by a small amount of heat conducting material, and it has a computer with a gigabyte of memory.  The purpose of the memory is to make records about the state of the heat reservoirs.  You can imagine a record being just two numbers - representing the temperatures of each - which is appended to a long list.  Now, obviously you need more than memory to make a computer: you need a CPU, a heat sink, and so forth.  However, I'm going to ignore all those components since - in this ideal computer - their macroscopic state does not change over time, unlike the states of the reservoirs and the memory.

Let's assume that one heat reservoir is at 0 Celsius (273K) and the other at 100 Celsius (373K), and let's consider an interval of time $\Delta t$ over which 1 Joule is transferred between them.  In the direction of time in which the heat "flows" from hot to cold, the entropy $S$ has increased by $\frac{1\,\mathrm{J}}{273\,\mathrm{K}}-\frac{1\,\mathrm{J}}{373\,\mathrm{K}} \approx 0.001\ \mathrm{J\,K^{-1}}$.

Remember that Boltzmann's $\Omega$ is the volume of the "foam bubble" in phase space in which every point "looks alike" from the macroscopic point of view.  So by $\Delta t$ the phase point is in a new "foam bubble" which is larger than its original home by a factor of $e^{0.001/k} \approx 2^{10^{20}}$ - that's the number of configurations you can have in 12.5 billion gigabytes of computer memory - roughly all the RAM that exists in the world!
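
Those figures are easy to verify.  A quick sketch of the arithmetic (my own check, using the rounded constants above):

```python
import math

k = 1.38e-23                     # Boltzmann's constant, J/K
Q, T_cold, T_hot = 1.0, 273.0, 373.0

dS = Q / T_cold - Q / T_hot      # entropy gain when 1 J flows from hot to cold
bits = dS / (k * math.log(2))    # the bubble grows by a factor of 2**bits

print(f"dS            = {dS:.2e} J/K")              # ~ 9.8e-4 J/K
print(f"volume factor = 2**{bits:.2e}")             # ~ 2**1e20
print(f"equivalent memory = {bits / 8 / 1e18:.1f} billion gigabytes")
```

With unrounded values this comes out at roughly 13 billion gigabytes, which matches the ballpark quoted above.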

In the picture above I've shown that the original "foam bubble" in phase space maps into the larger one but does not map completely onto it.  This is because - according to Liouville - the flow in phase space behaves like an incompressible fluid!  So the image is very wispy, with very little density within the target bubble.  In this time direction there is no reason why the gigabyte of memory could not become populated with records for the period $\Delta t$.

Now let's consider the opposite time direction.  Now we are mapping from a large bubble into a smaller one (a much much smaller one):

How is this possible when Liouville's Theorem tells us that phase space is incompressible?  The answer is that in fact the original larger bubble does not map into just one smaller bubble but into at least $2^{10^{20}}$ different bubbles.

Each of these bubbles represents the same macroscopic state for the reservoirs, so they must represent different states for the memory.  But wait!  One gigabyte of memory is only about $8\times10^{9}$ bits, so it can only be in $2^{8\times10^{9}}$ different states, which is vastly less than $2^{10^{20}}$.  This means that only a tiny subset of the larger bubble can possibly map into any of the smaller bubbles, and - within that - the smaller bubble you end up in is random.
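
Putting a number on "tiny": by Liouville, the portion of the large bubble that can land in any one small bubble is at most that small bubble's volume, so even allowing every possible memory state the record-bearing fraction is at most
$$
\frac{2^{8\times10^{9}}}{2^{10^{20}}} = 2^{\,8\times10^{9} - 10^{20}} \approx 2^{-10^{20}},
$$
which is as close to zero as makes no practical difference.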

Let's put it another way.  Suppose there's a trajectory through phase space linking a low entropy state to a high entropy state.  And suppose at the low entropy end there are some records describing how the macroscopic state changes along that path.  In that case it's just luck, because if you move the high entropy end by a microscopic amount, without changing its macroscopic state, then the trajectory will change, and at the new other end the records will probably no longer provide any meaningful information.  You simply cannot design a reliable machine which lays down records about the macroscopic state as time progresses in the high entropy to low entropy direction!

FOOTNOTES:

  • $\dagger$ Actually you can use any generalised co-ordinates  $q_i$ instead of $x_i$ so long as the $q_i$ and the $\dot{q_i}$ completely define the state.  You end up with different $p_i$ which are known as conjugate momenta instead of just momenta, and of course a different  function $H$.  But Hamilton's laws still work, and phase space has the same properties.
  • $\dagger_2$ For simplicity I've assumed that $x_i$ and $p_i$ are the microscopic degrees of freedom and temperature and pressure are the macroscopic degrees of freedom. The actual definition of statistical entropy is phrased in terms of more general degrees of freedom.
  • $\dagger_3$ Proof: Let $V=\prod_i \Delta x_i \Delta p_i$, then $\delta V = \sum_i \left(\frac{\delta \Delta x_i}{\Delta x_i} + \frac{\delta \Delta p_i}{\Delta p_i}\right)V$. But $\delta \Delta x_i = \delta t\, \Delta \dot{x_i} = \delta t\, \Delta x_i \frac{\partial \dot{x_i}}{\partial x_i} = \delta t\, \Delta x_i \frac{\partial^2 H}{\partial x_i \partial p_i}$.  Do a similar calculation for $\delta \Delta p_i$ (which picks up a minus sign) and substitute into the formula for $\delta V$ to see that everything cancels out.
