

In my post about my short talk at CQC, I mentioned that the groupoidification program in physics is based on a few simple concepts (most research programs are, I suppose). The ones I singled out are: state, symmetry, and history. But since concepts tend to seem simpler if you leave them undefined, there are bound to be subtleties here. Recently I’ve been thinking about the first one, state. What is a state? What is this supposedly simple concept?

Etymology isn’t an especially reliable indicator of what a word means, or even the history of a concept (words change meanings, and concepts shift over time), but it’s sometimes interesting to trace. The English word “state” comes from the Latin verb stare, meaning “to stand”, whose past participle is status, which is also borrowed directly into English. The Proto-Indo-European root sta- also means “stand”, and the English word “stand” itself comes from this root, but this time via Germanic (along with “understand” and “standard”). However, most of the words with this root come via various Latin intermediaries: state, stable, status, statue, stationary, station, and also substance and others. The state of affairs is sometimes referred to as being “how things stand”, how they are, the current condition. Most of the words based on the sta- root imply non-motion (i.e. “stasis”). If anything, “state” (like “status”) carries this connotation less strongly than most, since the state of affairs can change – but it emphasizes how things stand now and not how they’re changing. From this sense, we also get the political meaning of “a state”, a reified version of a term originally meaning the political condition of a country (by analogy with Latin expressions like status rei publicae, the “condition of public affairs”).

So, narrowing focus now, the “state” of a physical system is the condition it’s in. In different models of physics, this is described in different ways, but in each case, by the “condition” we mean something like a complete description of all the facts about the system we can get. But this means different things in different settings. So I just want to take a look at some of them.

Think of these different settings for physics as being literally “settings” (but please excuse the pun) of the switches on a machine. Three of the switches are labelled Thermal, Quantum, and Relativistic. The “Thermal” switch varies whether or not we’re talking about thermodynamics or ordinary mechanics. The “Quantum” switch varies whether we’re talking about a quantum or classical system.

The “Relativistic” switch, which I’ll ignore for this post, specifies what kind of invariance we have: Galilean for Newton’s physics; Lorentzian for Special Relativity; general covariance for General Relativity. But this gets into dynamics, and “state” implies things are, well, static – that is, it’s about kinematics. At the very least, in Relativity, it’s not canonical what you mean by “now”, and so the definition of a state must include choosing a reference frame (in SR), or a Cauchy hypersurface (in GR). So let’s gloss over that for now.

When all these switches are in the “off” position, we have classical mechanics. Here, we think of a state as, at a first level of approximation, an element of a set. Now, for serious classical mechanics, this set will be a symplectic manifold, like the cotangent bundle T^*M of some manifold M. This is actually a bit subtle already, since a point in T^*M represents a collection of positions and momenta (or some generalization thereof): that is, we can start with a space of “static” configurations, parametrized by the values of some observable quantities, but a state (contrary to what etymology suggests) also includes momenta describing how those quantities are changing with time (which, in classical mechanics, is a fairly unproblematic notion).

The Hamiltonian picture of the dynamics of the system then tells us: given its state, how is that state changing (velocities and forces, hence accelerations), which we can then use to calculate states at future times. This requires a Hamiltonian, H, which we think of as the energy, which can be calculated from the state. So, for example, kinetic plus potential energy: in the case of a particle moving in a potential on a line, H = K + V = p^2/(2m) + V(q). The space of states can be described without much reference to the Hamiltonian, but once we have H, we get a flow on that space, transforming old states into new states with time.
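To make the idea of a flow on the space of states concrete, here is a minimal sketch (not anything specific to this post's sources) that approximates the flow for a particle on a line with the standard kinetic term p^2/(2m). The choice of a harmonic potential V(q) = q^2/2, the leapfrog integrator, and all the names (`leapfrog`, `grad_V`, `energy`) are my own assumptions for illustration.

```python
import math

def leapfrog(q, p, grad_V, m=1.0, dt=0.01, steps=1000):
    """Approximate the Hamiltonian flow of H = p^2/(2m) + V(q).

    A state is the pair (q, p); each step maps an old state to a new one.
    """
    for _ in range(steps):
        p -= 0.5 * dt * grad_V(q)  # half kick: dp/dt = -dH/dq = -V'(q)
        q += dt * p / m            # drift:     dq/dt =  dH/dp = p/m
        p -= 0.5 * dt * grad_V(q)  # second half kick
    return q, p

def energy(q, p, m=1.0):
    """H for the illustrative harmonic potential V(q) = q^2 / 2."""
    return p * p / (2 * m) + 0.5 * q * q

# Assumed example: harmonic oscillator, so V'(q) = q.
q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, grad_V=lambda q: q)

# The exact flow for this potential is a rotation in the (q, p) plane.
t = 1000 * 0.01
exact_q, exact_p = math.cos(t), -math.sin(t)
```

The point of the sketch is only that H, a function on the state space, is all the integrator needs: the state space itself was defined before H was chosen.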

Now if we turn on the “Thermal” switch, we have a different notion of state. The standard image for the classical mechanical system is that we may be talking about a particle, or a few particles, or perhaps a rigid object, moving in space, maybe subject to some constraints. In thermodynamics, we are thinking of a statistical ensemble of objects – in the simplest case, N identical objects – and want to ask how energy is distributed among them. The standard image is of a box full of gas at some temperature: it’s full of molecules, each with its own trajectory, and they interact through collisions and exchange energy and momentum. Rather than tracking the exact positions of molecules, in thermodynamics a “state” is a distribution, or more precisely a probability measure, on the space of such microstates. We don’t assume we know the detailed microstate of the system – the positions and momenta of all the particles in the gas – but only something about how these are distributed. This reflects the real fact that we can only measure things like pressure, temperature, etc. The measure is telling us the proportion of particles with positions and momenta in a given range.

This is a big difference for something described by the same word “state”. Even assuming our underlying space of “microstates” is still the same T^*M, the state is no longer a point. One way to interpret the difference is that here the state is something epistemic. It describes what we know about the system, rather than everything about it. The measure answers the question: “given what we know, what is the likelihood the system is in microstate X?” for each X. Now, of course, we could take a space of all such measures: given our previous classical system, it’s a space of functionals on C(T^*M). Then the state can again be seen as an element of a set. But it’s more natural to keep in view its nature as a measure, or, if it’s nice enough, as a positive function on the space of states. (It’s interesting that this is an object of the same type as the Hamiltonian – this is, intuitively, the basis of what Carlo Rovelli calls the “Thermal Time Hypothesis”, summarized here, which is secretly why I wanted to write on this topic. But more on that in a later post. For one thing, before I can talk about it, I have to talk about what comes next.)

Now turn off the “Thermal” switch, and think about the “Quantum” switch. Here there are a couple of points of view.

To begin with, we describe a system in terms of a Hilbert space, and a state is a vector in a Hilbert space. Again, this could be described as an element of a set, but the complex linear structure is important, so we keep thinking of it as fundamental to the type of a state. In geometric quantization, one often starts with a classical system with a state space like T^*M = X, and then takes the Hilbert space \mathcal{H}=L^2(X), so that a state is (modulo analysis issues) basically a complex-valued function on X. This is something like the (positive real-valued) measure which gives a thermodynamic state, but the interpretation is trickier. Of course, if \mathcal{H} is an L^2-space, we can recover a probability measure, since the square modulus of \phi \in \mathcal{H} has finite total measure (so we can normalize it). But this isn’t enough to describe \phi, and the extra information of phases goes missing. In any case, the probability measure no longer has the obvious interpretation of describing the statistics of a whole ensemble of identical systems – only the likelihood of measuring particular values for one system in the state \phi. (In fact, there are various no-go theorems getting in the way of a probability interpretation of \phi, though this again involves dynamics – a recurring theme is that it’s hard to reason sensibly about states without dynamics). So despite some similarity, this concept of “state” is very different, and phase is a key part of how it’s different. I’ll be jiggered if I can say why, though: most of the “huh?” factor in quantum mechanics lives right about here.

Another way to describe the state of a quantum system is related to this probability, though. The inner product of \mathcal{H} (whether we found it as an L^2-space or not) gives a way to talk about statistics of the system under repeated observations. Observables, which for the classical picture are described by functions on the state space X, are now self-adjoint operators on \mathcal{H}. The expectation value for an observable A in the state \phi is \langle \phi | A | \phi \rangle (note that the Dirac notation implicitly uses self-adjointness of A). So the state has another, intuitively easier, interpretation: it’s a real-valued functional on observables, namely the one I just described.
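As a small illustration of why the square modulus alone loses information, here is a sketch in a two-dimensional Hilbert space (a toy stand-in I am assuming for illustration, not an L^2-space). Two vectors that differ only by a relative phase give the same probabilities in the chosen basis, but different expectation values \langle \phi | A | \phi \rangle for another observable. The names `probabilities`, `expectation`, `plus`, `minus` and the choice of the matrix `X` are hypothetical.

```python
import math

def probabilities(phi):
    """Normalized square moduli of the components of phi."""
    norm2 = sum(abs(c) ** 2 for c in phi)
    return [abs(c) ** 2 / norm2 for c in phi]

def expectation(phi, A):
    """<phi|A|phi> / <phi|phi> for a vector phi and a self-adjoint matrix A."""
    n = len(phi)
    norm2 = sum(abs(c) ** 2 for c in phi)
    A_phi = [sum(A[i][j] * phi[j] for j in range(n)) for i in range(n)]
    value = sum(phi[i].conjugate() * A_phi[i] for i in range(n)) / norm2
    return value.real  # real because A is assumed self-adjoint

s = 1 / math.sqrt(2)
plus = [complex(s), complex(s)]    # components with relative phase +1
minus = [complex(s), complex(-s)]  # same moduli, relative phase -1

X = [[0, 1], [1, 0]]  # an observable that is sensitive to the relative phase

probs_plus = probabilities(plus)
probs_minus = probabilities(minus)
exp_plus = expectation(plus, X)
exp_minus = expectation(minus, X)
```

Both vectors give probabilities of one half for each basis element, yet the functional on observables tells them apart, which is the sense in which the state-as-functional carries the phase information.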

The observables live in the algebra \mathcal{A} = \mathcal{B}(\mathcal{H}) of bounded operators on \mathcal{H}. Setting both Thermal and Quantum switches of our notion of “state” gives quantum statistical mechanics. Here, the “C*-algebra” (or von Neumann-algebra) picture of quantum mechanics says that really it’s the algebra \mathcal{A} that’s fundamental – it corresponds to actual operations we can perform on the system. Some of them (the self-adjoint ones) represent really very intuitive things, namely observables, which are tangible, measurable quantities. In this picture, \mathcal{H} isn’t assumed to start with at all – but when it is, the kind of object we’re dealing with is a density matrix. This is (roughly) a positive operator on \mathcal{H} of unit trace. In general a state on a von Neumann algebra is a positive linear functional, normalized to take the value 1 on the identity.
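Here is a minimal sketch of a density matrix acting as a functional on observables, via the standard formula A \mapsto \mathrm{Tr}(\rho A), in an assumed two-dimensional example. The helper names (`matmul`, `trace`, `expect`) and the particular matrices are illustrative choices of mine. It compares a pure state with an equal mixture that has the same diagonal.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def expect(rho, A):
    """The state rho as a functional on observables: A -> Tr(rho A)."""
    return trace(matmul(rho, A))

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

# Pure state: projection onto the vector (1, 1)/sqrt(2).
rho_pure = [[0.5, 0.5], [0.5, 0.5]]
# Mixed state: equal classical mixture of the two basis vectors.
rho_mixed = [[0.5, 0.0], [0.0, 0.5]]

z_pure, z_mixed = expect(rho_pure, SIGMA_Z), expect(rho_mixed, SIGMA_Z)
x_pure, x_mixed = expect(rho_pure, SIGMA_X), expect(rho_mixed, SIGMA_X)
```

Both matrices have unit trace and agree on `SIGMA_Z`, but they disagree on `SIGMA_X`: the mixture has no coherence between the basis vectors, which is the quantum version of the thermal "we only know the distribution" idea.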

This is analogous to the view of a state as a probability measure (positive function with unit total integral) in the classical realm: if an observable is a function on states (giving the value of that observable in each state), then a measure is indeed a functional on the space of observables. A probability measure, in fact, is the functional giving the expectation value of the observable. (And, since variance and all the higher moments of the probability distribution for that observable are themselves defined as expectation values, it also tells us all of those.)
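The classical side of this analogy can be sketched on a finite set of microstates (an assumption made purely to keep the example small; the post's state space is T^*M). The probability measure is stored as weights, and it acts on an observable by returning its expectation value; the variance is then just another expectation value. The names `measure`, `expectation`, `variance`, and `energy` are hypothetical.

```python
def expectation(measure, f):
    """The measure as a functional on observables: f -> sum of w(x) * f(x)."""
    return sum(w * f(x) for x, w in measure.items())

def variance(measure, f):
    """Variance of f, itself defined as an expectation value."""
    mean = expectation(measure, f)
    return expectation(measure, lambda x: (f(x) - mean) ** 2)

# Four illustrative microstates, labelled 0..3, with weights summing to 1.
measure = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

def energy(x):
    """An illustrative observable: a function on microstates."""
    return float(x)

total = sum(measure.values())
mean_energy = expectation(measure, energy)
var_energy = variance(measure, energy)
```

Compare this with the quantum case: there the weights are replaced by a density matrix and the sum by a trace, but the shape of the operation (state applied to observable gives a number) is the same.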

On the other hand, the Gelfand-Naimark-Segal theorem says that, given a state \phi : \mathcal{A} \rightarrow \mathbb{C} (which takes real values on the self-adjoint elements), there’s a representation of \mathcal{A} as an algebra of operators on some Hilbert space, and a vector v for which this \phi is just \phi(A) = \langle v | A | v \rangle. This is the GNS representation (and in fact it’s built by taking the regular representation of \mathcal{A} on itself by multiplication, with \mathcal{A} made into a Hilbert space by defining the inner product to make this property work, and with v = 1). So the view here is that a state is some kind of operation on observables – a much more epistemic view of things. So although the GNS theorem relates this to the vector-in-Hilbert-space view of “state”, they are quite different conceptually. (For one thing, the GNS representation is giving a different Hilbert space for each state, which undermines the sense that the space of ALL states is fundamentally “there”, but in both pictures \mathcal{A} is the same for all states.)
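A finite-dimensional sketch may make the GNS construction less mysterious. Assume \mathcal{A} is the algebra of 2x2 complex matrices and the state is \phi(A) = \mathrm{Tr}(\rho A) for a fixed invertible density matrix \rho (both are my assumptions; choosing \rho invertible means no nonzero element has zero norm, so the quotient step the general construction needs can be skipped). The algebra becomes a Hilbert space with inner product \langle A, B \rangle = \phi(A^* B), it acts on itself by left multiplication, and v is the identity matrix. All function names are hypothetical.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(X):
    """Conjugate transpose."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

RHO = [[0.75, 0], [0, 0.25]]  # assumed invertible density matrix

def state(A):
    """The state phi as a functional on the algebra."""
    return trace(matmul(RHO, A))

def inner(A, B):
    """GNS inner product on the algebra itself: <A, B> = phi(A* B)."""
    return state(matmul(dagger(A), B))

def act(A, B):
    """Regular representation: A acts on the vector B by left multiplication."""
    return matmul(A, B)

IDENTITY = [[1, 0], [0, 1]]   # the cyclic vector v = 1
A = [[1, 2j], [-2j, 3]]       # an illustrative self-adjoint element

lhs = state(A)                       # phi(A)
rhs = inner(IDENTITY, act(A, IDENTITY))  # <v | A | v> in the GNS space
norm2_A = inner(A, A)                # positivity check: <A, A> >= 0
```

With v = 1 the identity \phi(A) = \langle v | A | v \rangle holds almost by definition, which is really the content of "defining the inner product to make this property work".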

(This von Neumann-algebra point of view, by the way, gets along nicely with the 2-Hilbert space lens for looking at quantum mechanics, which may partly bridge the gap between it and the Hilbert-space view. The category of representations of a von Neumann algebra is a 2-Hilbert space. A “2-vector” (or “2-state”, if you like) in this category is a representation of the algebra. So the GNS representation itself is a “2-state”. This raises the question about 2-algebras of 2-operators, and John Baez’ question: “What is the categorified GNS theorem?” But let’s leave 2-states for later along with the rest.)

So where does this leave us regarding the meaning of “state”? The classical view is that a state is an element of some (structured) set. The usual quantum picture is that a state is, depending on how precise you want to be, either a vector in a Hilbert space, or a 1-D subspace of that Hilbert space – that is, a point in the projective Hilbert space. What these two views have in common is that there is some space of all “possible worlds”, i.e. of all ways things can be in the system being studied. A state is then a way of selecting one of these. The difference is in what this space of possible worlds is like – that is, which category it lives in – and how exactly one “selects” a state. In particular, they differ in the possibility of taking combinations of states. As for selecting states, Sets is a Cartesian category, with a terminal object 1 = {*}: an element of a set is a map from 1 into it. Hilb is a monoidal category, but not Cartesian: selecting a single vector has no obvious categorical equivalent, though selecting a 1-D subspace amounts to a map from \mathbb{C} (up to isomorphism). So the model of an “element” isn’t a singleton, it’s the complex line – and it relates to other possible spaces differently: not as a terminal object, but as a monoidal unit. This is a categorical way of saying how the idea of “state” is structurally different.

The thermal point of view is a little more epistemically subtle: for both classical and quantum pictures, it’s best thought of as, not a possible world, but a function acting on observables (that is, conditions of knowledge). In the classical picture, this is directly related to a space of possible worlds – it’s a measure on it, which we can think of as saying how a large ensemble of systems are distributed in that space. In the quantum picture, in some ways the most (epistemically) natural view, in terms of von Neumann algebras, breaks the connection to this notion of “possible worlds” altogether, since \mathcal{A} has representations on many different Hilbert spaces.

So a philosophical question is: what do these different concepts have in common that lets us use them all to represent the “same” root idea? Without actually answering this, I’ll just mention that at some point I’d like to talk a bit about “2-states” as 2-vectors, and in general how to categorify everything above.

A couple of posts ago, I mentioned Max Jammer’s book “Concepts of Space” as a nice genealogy of that concept, with one shortcoming from my point of view – namely, as the subtitle suggests, it’s a “History of Theories of Space in Physics”, and since physics tends to use concepts out of mathematics, it lags a bit – at least as regards fundamental concepts. Riemannian geometry predates Einstein’s use of it in General Relativity by fifty-some years, for example. Heisenberg reinvented matrices and matrix multiplication (which eventually led to wholesale importation of group theory and representation theory into physics). More examples no doubt could be found (String Theory purports to be a counterexample, though opinions differ as to whether it is real physics, or “merely” important mathematics; until it starts interacting with experiments, I’m inclined to the latter, though of course contra Hardy, all important mathematics eventually becomes useful for something).

What I said was that it would be nice to see further investigation of concepts of space within mathematics, in particular Grothendieck’s and Connes’. Well, in a different context I was referred to this survey paper by Pierre Cartier from a few years back, “A Mad Day’s Work: From Grothendieck To Connes And Kontsevich, The Evolution Of Concepts Of Space And Symmetry”, which does at least some of that – it’s a fairly big-picture review that touches on the relationship between these new ideas of space. It follows that stream of the story of space up to the end of the 20th century or so.

There’s also a little historical/biographical note on Alexander Grothendieck – the historical context is nice to see (one of the appealing things about Jammer’s book). In this case, much of the interesting detail is more relevant if you find recent European political history interesting – but I do, so that’s okay. In fact, I think it’s helpful – maybe not mathematically, but in other ways – to understand the development of mathematical ideas in the context of history. This view seems to be better received the more ancient the history in question.

On the scientific end, Cartier tries to explain Grothendieck’s point of view of space – in particular what we now call topos theory – and how it developed, as well as how it relates to Connes’.  Pleasantly enough, a key link between them turns out to be groupoids!  However, I’ll pass on commenting on that at the moment.

Instead, let me take a bit of a tangent and jump back to Jammer’s book.  I’ll tell you something from his chapter “Emancipation from Aristotelianism” which I found intriguing.  This would be an atomistic theory of space – an idea that’s now beginning to make something of a comeback, in the guise of some of the efforts toward a quantum theory of gravity (EDIT: but see comments below).  Loop quantum gravity, for example, deals with space in terms of observables, which happen to take the form of holonomies of connections around loops.  Some of these observables have interpretations in terms of lengths, areas, and volumes.  It’s a prediction of LQG that these measurements should have “quantized”, which is to say discrete, values: states of LQG are “spin networks”, which is to say graphs with (quantized) labels on the edges, interpreted as areas (in a dual cell complex).  (Notice this is yet again another view of space, different from Grothendieck’s or Connes’, but it shares with Connes’ especially the idea of probing space in some empirical way.  Grothendieck “probes” space mainly via cohomology – how “empirical” that is depends on your point of view.)

The atomistic theory of space Jammer talks about is very different, but it does also come from trying to reconcile a discrete “quantum” theory of matter with a theory linking matter to space.  In particular, the medieval Muslim philosophical school known as al Kalam tried to reconcile the Koran and Islamic theology with Greek philosophy (most of the “Hellenistic” world conquered by Alexander the Great, not least Egypt, is inside Dar al Islam, which is why many important Greek texts came into Europe via Arabic translations).  Though they were, as Jammer says, “Emancipating” themselves from Aristotle, they did share some of his ideas about space.

For Aristotle, space meant “place” – the answer to the questions “where is it?” and “what is its shape and size?”. In particular, it was first and foremost an attribute of some substance.  All “where?” questions are about some THING.  The answer is defined in terms of other things: my cat is on the ground, under the tree, beside the house.  The “place” of an object was literally the inner shell of the containing body that held it (which was contained by some other body, and so on – there being no vacuum in Aristotle).  So my “place” is defined by (depending how you look at it) my skin, my clothes, or the walls of the room I’m in.  This is a relational view of space, though more hard-headed than, say, Leibniz’s.

The philosophers of the Kalam had a similar relational view of space, but they didn’t accept Aristotle’s view of “substances”, where each thing has its own essential identity, on which attributes are hung like hats.  Instead, they believed in atomism, following Democritus and Leucippus: bodies were made out of little indivisible nuggets called “atoms”.  Macroscopic things were composites of atoms, and their attributes resulted from how the atoms were put together.  Here’s Jammer’s description:

The atoms of the Kalam are indivisible particles, equal to each other and devoid of all extension.  Spatial magnitude can be attributed only to a combination of atoms forming a body.  Although a definite position (hayyiz) belongs to each individual atom, it does not occupy space (makan).  It is rather the set of these positions – one is almost tempted to say, the system of relations – that constitutes spatial extension….

In the Kalam, these rather complicated and surprisingly abstract ideas were deemed necessary in order to meet Aristotle’s objections against atomism on the ground that a spatial continuum cannot be constituted by, or resolved into, indivisibles nor can two points be continuous or contiguous with one another.

So like people who prefer a “background independent” quantum theory of gravity, they wanted to believe that space (geometry) derives from matter, and that matter is discrete, but space was commonly held to be continuous.  Also alike, they resolved the problem by discarding the assumption of continuous space and, by consideration of motion, were led to discrete time as well.

There are some differences, though.  The most obvious is that the nodes of the graph in a spin network state don’t represent units of matter, or “atoms”.  For that matter, quantum field theory doesn’t really have “atoms” in the sense of indivisible units which don’t break apart or interact.  Everything interacts in QFT.  (In some sense, interactions are more fundamental units in QFT than “particles” are – particles only (sic!) serve to connect one interaction with another.)

Another key difference is how space relates to matter.  In Aristotle, and in the Kalam, space is defined directly by matter: two bits of matter “define” the space between them.  In General Relativity (the modern theory with the “relational” view of space), there’s still room for space as an actor in its own right, like Newton’s absolute space-as-independent-variable – in other words, room for a vacuum, which Aristotle categorically denied could even conceivably exist.  In GR, what matter determines is the curvature of space (more precisely the Einstein tensor of the curvature).

(Edit: To emphasize a key difference glossed over before…  It was coupling to quantum matter which suggested quantizing the picture of space.  Discreteness of the spectrum of various observables is a logically separate prediction in each case.  Either matter or space(time) could have had continuous spectrum for the relevant observables and still been quantized – discrete matter would have given discreteness for some observed quantities, but not area, length, and so on.  So in the modern setting, the link is much less direct.)

Well, so the differences are probably more informative than the similarities, but the fact that theories of related discreteness in matter, space, and time, have been around for a thousand years or more is intriguing.  The idea of empty space as an independent entity – in the modern form only about three hundred years old – appears to be the real novel part.  One of the nice intuitions in Carlo Rovelli’s book on Quantum Gravity, for me at least, was to say that, rather than there being a separate “space”, we have a theory of fields defined on other fields as background – one of which, the “gravitational field” has customarily been taken for “space”.  So spatial geometry is a field, and it has some propagating (through space!) degrees of freedom – the particle associated to this field is a graviton.  Nobody’s ever seen one, mind you – but supposing they exist makes many things easier.

To re-state a previous point: I think this is a nice aspect of categorification for dealing with space.  Extending the “stuff/structure/properties” trichotomy to allow space to resemble both “stuff” and relations between stuff leaves room for both points of view.

I mention this because tomorrow I leave London (Ontario) for London (England), and thence to Nottingham, for the Quantum Gravity and Quantum Geometry Conference.  It’s been a while since I worked much on quantum gravity, per se, but this conference should be interesting because it seems to be a confluence of mathematically and physically inclined people, as the name suggests.  I read on the program, for example, that Jerzy Lewandowski is speaking on QFT in Quantum Curved Spacetime, and suddenly remember that, oh yes, I did a Masters thesis (viz) on QFT in curved (classical) spacetime… but that was back in the 20th century!

It’s been a while, and I only made a small start at it before, but that whole area of physics is quite pretty.  Anyway, it should be interesting, and there are a number of people I’m looking forward to talking to.