When I made my previous two posts about ideas of “state”, one thing I was aiming at was to say something about the relationships between states and dynamics. The point here is that, although the idea of “state” is that it is intrinsically something like a snapshot capturing how things are at one instant in “time” (whatever that is), extrinsically, there’s more to the story. The “kinematics” of a physical theory consists of its collection of possible states. The “dynamics” consists of the regularities in how states change with time. Part of the point here is that these aren’t totally separate.

Just for one thing, in classical mechanics, the “state” includes time-derivatives of the quantities you know, and the dynamical laws tell you something about the second derivatives. This is true in both the Hamiltonian and Lagrangian formalisms of dynamics. The Hamiltonian, which represents the concept of “energy” in the context of a system, is a function H(q,p), where q is a vector representing the values of some collection of variables describing the system (generalized position variables, in some configuration space X), and the p = m \dot{q} are corresponding “momentum” variables, which are the other coordinates in a phase space which in simple cases is just the cotangent bundle T*X. Here, m refers to mass, or some equivalent. The familiar case of a moving point particle has “energy = kinetic + potential”, or H = p^2 / (2m) + V(q) for some potential function V. The symplectic form on T*X can then be used, together with H, to define a path through any point, which describes the evolution of the system in time – notably, it conserves the energy H. Then there’s the Lagrangian, which defines the “action” associated to a path, which comes from integrating some function L(q, \dot{q}) living on the tangent bundle TX, over the path. The physically realized paths (classically) are critical points of the action, with respect to variations of the path.
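To make the claim about energy conservation concrete, here is a minimal sketch in Python. It assumes a one-dimensional particle with the standard kinetic term p^2/(2m) and a harmonic potential V(q) = k q^2 / 2; the potential, the parameter values, and the function names are all illustrative choices of mine, not anything from the formalism itself. It steps Hamilton's equations with the leapfrog scheme, which is one common symplectic integrator, and compares the energy before and after.

```python
def leapfrog(q, p, grad_V, m, dt, steps):
    """Integrate dq/dt = p/m, dp/dt = -V'(q) with the leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * grad_V(q)  # half step in momentum
        q += dt * p / m            # full step in position
        p -= 0.5 * dt * grad_V(q)  # second half step in momentum
    return q, p


def energy(q, p, m, k):
    """H = p^2 / (2m) + V(q), with the assumed potential V(q) = k q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


m, k = 1.0, 1.0      # illustrative mass and spring constant
q0, p0 = 1.0, 0.0    # initial point in phase space
q1, p1 = leapfrog(q0, p0, lambda q: k * q, m, dt=0.01, steps=10_000)

E0 = energy(q0, p0, m, k)
E1 = energy(q1, p1, m, k)
print(E0, E1)  # the two values should agree closely
```

The numerical energy is only approximately conserved, but because the integrator respects the symplectic structure, the error stays bounded rather than drifting over long times.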

This is all based on the view of a “state” as an element of a set (which happens to be a symplectic manifold like T*X or just a manifold if it’s TX), and both the “energy” and the “action” are some kind of function on this set. A little extra structure (symplectic form, or measure on path space) turns these functions into a notion of dynamics. Now a function on the space of states is what an observable is: energy certainly is easy to envision this way, and action (though harder to define intuitively) counts as well.

But another view of states which I mentioned in that first post is the one that pertains to statistical mechanics, in which a state is actually a statistical distribution on the set of “pure” states. This is rather like a function – it’s slightly more general, since a distribution can have point-masses, but any function gives a distribution if there’s a fixed measure d\mu around to integrate against – then a function like H becomes the measure H d\mu. And this is where the notion of a Gibbs state comes from, though it’s slightly trickier. The idea is that the Gibbs state (in some circumstances called the Boltzmann distribution) is the state a system will end up in if it’s allowed to “thermalize” – it’s the maximum-entropy distribution for a given amount of energy in the specified system, at a given temperature T. So, for instance, for a gas in a box, this describes how, at a given temperature, the kinetic energies of the particles are (probably) distributed. Up to a bunch of constants of proportionality, one expects that the weight given to a state (or region in state space) is just exp(-H/T), where H is the Hamiltonian (energy) for that state. That is, the likelihood of being in a state is inversely proportional to the exponential of its energy – and higher temperature makes higher energy states more likely.

Now part of the point here is that, if you know the Gibbs state at temperature T, you can work out the Hamiltonian just by taking a logarithm (up to an additive constant, which doesn’t affect the dynamics) – so specifying a Hamiltonian and specifying the corresponding Gibbs state are essentially equivalent. But specifying a Hamiltonian (given some other structure) completely determines the dynamics of the system.
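Here is a small sketch of that round trip for a system with finitely many pure states. The energy values and the function names are made up for illustration. The weights are exp(-H/T), normalized by their sum Z, and taking -T log of the probabilities gives back H + T log Z, that is, the Hamiltonian shifted by a constant.

```python
import math


def gibbs(energies, T):
    """Normalized Gibbs weights exp(-E/T) / Z for a finite list of energies."""
    weights = [math.exp(-E / T) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]


def recover_hamiltonian(probs, T):
    """Return -T log p for each state, which equals E + T log Z."""
    return [-T * math.log(p) for p in probs]


energies = [0.0, 1.0, 2.5]   # illustrative energy levels
T = 2.0

probs = gibbs(energies, T)
recovered = recover_hamiltonian(probs, T)

# Remove the constant shift by pinning the lowest level back to its original value.
shift = recovered[0] - energies[0]
unshifted = [r - shift for r in recovered]
print(unshifted)

# Raising the temperature puts more weight on the highest-energy state.
cold_top = gibbs(energies, 1.0)[-1]
hot_top = gibbs(energies, 4.0)[-1]
print(cold_top, hot_top)
```

Since only energy differences enter Hamilton's equations, the lost constant makes no difference to the dynamics.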

This is the classical version of the idea Carlo Rovelli calls “Thermal Time”, which I first encountered in his book “Quantum Gravity”, but also is summarized in Rovelli’s FQXi essay “Forget Time”, and described in more detail in this paper by Rovelli and Alain Connes. Mathematically, this involves the Tomita flow on von Neumann algebras (which Connes used to great effect in his work on the classification of same). It was reading “Forget Time” which originally got me thinking about making the series of posts about different notions of state.

Physically, remember, these are von Neumann algebras of operators on a quantum system, the self-adjoint ones being observables; states are linear functionals on such algebras. The equivalent of a Gibbs state – a thermal equilibrium state – is called a KMS (Kubo-Martin-Schwinger) state (for a particular Hamiltonian). It’s important that the KMS state depends on the Hamiltonian, which is to say the dynamics and the notion of time with respect to which the system will evolve. Given a notion of time flow, there is a notion of KMS state.

One interesting place where KMS states come up is in (general) relativistic thermodynamics. In particular, the effect called the Unruh Effect is an example (here I’m referencing Robert Wald’s book, “Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics”). Physically, the Unruh effect says the following. Suppose you’re in flat spacetime (described by Minkowski space), and an inertial (unaccelerated) observer sees it in a vacuum. Then an accelerated observer will see space as full of a bath of particles at some temperature related to the acceleration. Mathematically, a change of coordinates (acceleration) implies there’s a one-parameter family of automorphisms of the von Neumann algebra which describes the quantum field for particles. There’s also a (trivial) family for the unaccelerated observer, since the coordinate system is not changing. The Unruh effect in this language is the fact that a vacuum state relative to the time-flow for an unaccelerated observer is a KMS state relative to the time-flow for the accelerated observer (at some temperature related to the acceleration).

The KMS state for a von Neumann algebra with a given Hamiltonian operator has a density matrix \omega, which is again, up to some constant factors, just the exponential exp(-H/T) of the Hamiltonian operator, as in the classical case. (For pure states, \omega = |\Psi \rangle \langle \Psi |, and in general a matrix becomes a state by \omega(A) = Tr(A \omega) which for pure states is just the usual expectation value for A, \langle \Psi | A | \Psi \rangle).

Now, things are a bit more complicated in the von Neumann algebra picture than the classical picture, but Tomita-Takesaki theory tells us that as in the classical world, the correspondence between dynamics and KMS states goes both ways: there is a flow – the Tomita flow – associated to any given state, with respect to which the state is a KMS state. By “flow” here, I mean a one-parameter family of automorphisms of the von Neumann algebra. In the Heisenberg formalism for quantum mechanics, this is just what time is (i.e. states remain the same, but the algebra of observables is deformed with time). The way you find it is as follows (and why this is right involves some operator algebra I find a bit mysterious):

First, get the algebra \mathcal{A} acting on a Hilbert space H, with a cyclic vector \Psi (i.e. such that \mathcal{A} \Psi is dense in H – one way to get this is by the GNS representation, so that the state \omega just acts on an operator A by the expectation value at \Psi, as above, so that the vector \Psi is standing in, in the Hilbert space picture, for the state \omega). Then one can define an operator S by the fact that, for any A \in \mathcal{A}, one has

(SA)\Psi = A^{\star}\Psi

That is, S acts like the conjugation operation on operators at \Psi, which is enough to define S since \Psi is cyclic. This S has a polar decomposition (analogous for operators to the polar form for complex numbers), conventionally written S = J \Delta^{1/2}, where J is antiunitary (this is conjugation, after all) and \Delta is positive and self-adjoint. We need the self-adjoint part, because the Tomita flow is a one-parameter family of automorphisms given by:

\alpha_t(A) = \Delta^{-it} A \Delta^{it}

An important fact for Connes’ classification of von Neumann algebras is that the Tomita flow is basically unique – that is, it’s unique up to an inner automorphism (i.e. a conjugation by some unitary operator – so in particular, if we’re talking about a relativistic physical theory, a change of coordinates giving a different t parameter would be an example). So while there are different flows, they’re all “essentially” the same. There’s a unique notion of time flow if we reduce the algebra \mathcal{A} to its cosets modulo inner automorphism. Now, in some cases, the Tomita flow consists entirely of inner automorphisms, and this reduction makes it disappear entirely (this happens in the finite-dimensional case, for instance). But in the general case this doesn’t happen, and the Connes-Rovelli paper summarizes this by saying that von Neumann algebras are “intrinsically dynamic objects”. So this is one interesting thing about the quantum view of states: there is a somewhat canonical notion of dynamics present just by virtue of the way states are described. In the classical world, this isn’t the case.
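For the finite-dimensional case, where the flow is inner, the link between a Gibbs state and "its" flow can be checked by hand. The sketch below is only an illustration under stated assumptions: the algebra is 3x3 complex matrices, the Hamiltonian is diagonal with made-up energy levels, the flow is taken to be \alpha_t(B) = e^{iHt} B e^{-iHt}, and the KMS condition is written as \omega(A \alpha_{i\beta}(B)) = \omega(BA) with \beta = 1/T. Sign and scaling conventions differ between references, and the helper names are mine.

```python
import cmath
import math


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def gibbs_expectation(levels, beta, A):
    """omega(A) = Tr(A exp(-beta H)) / Z for a diagonal H with the given levels."""
    weights = [math.exp(-beta * E) for E in levels]
    Z = sum(weights)
    return sum(w * A[i][i] for i, w in enumerate(weights)) / Z


def flow(levels, z, B):
    """alpha_z(B) = exp(iHz) B exp(-iHz) for diagonal H; z may be complex."""
    n = len(levels)
    return [[cmath.exp(1j * z * (levels[j] - levels[k])) * B[j][k]
             for k in range(n)] for j in range(n)]


levels = [0.0, 0.7, 1.9]   # illustrative energy levels
beta = 1.3                 # illustrative inverse temperature

A = [[1, 2 - 1j, 0.5], [0.3j, -1, 4], [2, 1 + 1j, 0]]
B = [[0, 1j, 3], [1, 2, -2j], [0.5, -1, 1]]

# KMS condition: evolving B by imaginary time i*beta swaps the order of the product.
lhs = gibbs_expectation(levels, beta, matmul(A, flow(levels, 1j * beta, B)))
rhs = gibbs_expectation(levels, beta, matmul(B, A))
print(lhs, rhs)

# The state is also invariant under the flow at real times.
before = gibbs_expectation(levels, beta, B)
after = gibbs_expectation(levels, beta, flow(levels, 0.37, B))
print(before, after)
```

Here the flow is conjugation by the unitary e^{iHt}, which lies in the algebra itself, so it is an inner automorphism, consistent with the remark that the flow disappears modulo inner automorphisms in finite dimensions.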

Now, Rovelli’s “Thermal Time” hypothesis is, basically, that the notion of time is a state-dependent one: instead of an independent variable, with respect to which other variables change, quantum mechanics (per Rovelli) makes predictions about correlations between different observed variables. More precisely, the hypothesis is that, given that we observe the world in some state, the right notion of time should just be the Tomita flow for that state. They claim that checking this for certain cosmological models, like the Friedman model, they get the usual notion of time flow. I have to admit, I have trouble grokking this idea as fundamental physics, because it seems like it’s implying that the universe (or any system in it we look at) is always, a priori, in thermal equilibrium, which seems wrong to me since it evidently isn’t. The Friedman model does assume an expanding universe in thermal equilibrium, but clearly we’re not in exactly that world. On the other hand, the Tomita flow is definitely there in the von Neumann algebra view of quantum mechanics and states, so possibly I’m misinterpreting the nature of the claim. Also, as applied to quantum gravity, a “state” perhaps should be read as a state for the whole spacetime geometry of the universe – which is presumably static – and then the apparent “time change” would then be a result of the Tomita flow on operators describing actual physical observables. But on this view, I’m not sure how to understand “thermal equilibrium”.  So in the end, I don’t really know how to take the “Thermal Time Hypothesis” as physics.

In any case, the idea that the right notion of time should be state-dependent does make some intuitive sense. The only physically, empirically accessible referent for time is “what a clock measures”: in other words, there is some chosen system which we refer to whenever we say we’re “measuring time”. Different choices of system (that is, different clocks) will give different readings even if they happen to be moving together in an inertial frame – atomic clocks sitting side by side will still gradually drift out of sync. Even if “the system” means the whole universe, or just the gravitational field, clearly the notion of time even in General Relativity depends on the state of this system. If there is a non-state-dependent “god’s-eye view” of which variable is time, we don’t have empirical access to it. So while I can’t really assess this idea confidently, it does seem to be getting at something important.

Last Friday, UWO hosted a Distinguished Colloquium talk by Gregory Chaitin, who was talking about a proposal for a new field he calls “metabiology”, which he defined in the talk (and on the website above) as “a field parallel to biology, dealing with the random evolution of artificial software (computer programs) rather than natural software (DNA), and simple enough that it is possible to prove rigorous theorems or formulate heuristic arguments at the same high level of precision that is common in theoretical physics.” This field doesn’t really exist to date, but his talk was intended to argue that it should, and to suggest some ideas as to what it might look like. It was a well-attended talk with an interdisciplinary audience including (at least) people from the departments of mathematics, computer science, and biology. As you might expect for such a talk, it was also fairly nontechnical.

A lot of the ideas presented in the talk overlapped with those in this outline, but to summarize… One of the motivating ideas that he put forth was that there is currently no rigorous proof that Darwin-style biological evolution can work – i.e. that operations of mutation and natural selection can produce systems of very high complexity. This is a fundamental notion in biology, summarized by the slogan, “Nothing in biology makes sense except in light of evolution”. This phrase, funnily, was coined as the title of a defense of a “theistic evolution” – not obviously a majority position among scientists, but also not to be confused with “intelligent design” which claims that evolution can’t account for observed features of organisms. This is a touchy political issue in some countries, and it’s not obvious that a formal proof that mutation and selection CAN produce highly complex forms would resolve it. Even so, as Chaitin said, it seems likely that such a proof could exist – but if there’s a rigorous proof of the contrary, that would be good to know also!

Of course, such a formal proof doesn’t exist because formal proof doesn’t play much role in biology, or any other empirical science – since living things are very complex, and incompletely understood. Thus the proposal of a different field, “metabiology”, which would study simpler formal objects: “artificial software” in the form of Turing machines or program code, as opposed to “natural software” like DNA. This abstracts away everything about an organism except its genes (which is a lot!), with the aim of simplifying enough to prove that mutation and selection in this toy world can generate arbitrarily high levels of complexity.

Actually stating this precisely enough to prove ties in to the work that Chaitin is better known for, namely the study of algorithmic complexity and theoretical computer science. The two theorems Chaitin stated (but didn’t prove in the talk) did not – he admitted – really meet that goal, but perhaps did point in that direction. One measure of complexity is the size of the smallest Turing machine (for example, though a similar definition applies to other universal ways of describing algorithms) which is needed to generate a particular pattern. A standard example is the “Busy Beaver function”, and one way to define it is to say that B(n) is the largest number printed out by an n-state Turing machine which then halts. Since the halting problem is uncomputable (i.e. there’s no Turing machine which, given a description of another machine, can always decide whether or not it halts), for reasons analogous to Cantor’s diagonal argument or Gödel’s incompleteness theorem, generating B(n), or a sequence of the same order, is a good task to measure complexity.
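As a small illustration, here is a sketch of a Turing machine simulator run on the well-known 2-state, 2-symbol busy beaver champion. It uses the common convention of counting the 1s left on the tape at halting, which is a slightly different bookkeeping from "the number printed out"; the function name and the step cap are my own choices. The cap is there only so that the simulator always returns, since in general there is no way to know in advance whether a machine halts.

```python
def run(machine, max_steps):
    """Run a Turing machine on a blank tape.

    `machine` maps (state, symbol) to (symbol to write, move "L" or "R", next state).
    The start state is "A" and the halting state is "H".
    Returns (halted, steps taken, number of 1s on the tape).
    """
    tape, head, state, steps = {}, 0, "A", 0
    while state != "H" and steps < max_steps:
        write, move, state = machine[(state, tape.get(head, 0))]
        tape[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    return state == "H", steps, sum(tape.values())


# The 2-state busy beaver champion.
bb2 = {
    ("A", 0): (1, "R", "B"), ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"), ("B", 1): (1, "R", "H"),
}

halted, steps, ones = run(bb2, max_steps=100)
print(halted, steps, ones)  # True 6 4
```

No 2-state machine of this kind does better than four 1s, but the corresponding values grow faster than any computable function as the number of states increases.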

So the first toy model involved a single organism, being replaced in each generation by a mutant form. The “organism” is a Turing machine (or a program in some language, etc. – one key result from complexity theory is that all these different ways to specify an algorithm can simulate each other, with the addition of at most a fixed-size prefix, which is the part of the algorithm describing how to do the simulation). In each generation, it is mutated. The mutant replaces the original organism if: (a) the new code halts, and (b) outputs a number which (c) is larger than the number produced by the original. Now, this decision procedure is uncomputable since it requires solving the halting problem – so in particular, there’s no way to simulate this process. But the theorem says that, in exponential time (i.e. t(n) \sim O(e^n)), this process will produce a machine which produces a number of order B(n). That is, as long as the “environment” (the thing doing the selection) can recognize and reward complexity, mutation is sufficient to produce it. But these are pretty big assumptions, which is one reason this theorem isn’t quite what’s wanted.
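The selection rule in this toy model can be sketched in code, with one important caveat: the real rule needs a halting oracle, so the version below swaps in a step budget and simply rejects any mutant that has not halted within it. That makes it a computable stand-in, not the process in the theorem. The 3-state machines, the fitness (number of 1s left on the tape), the single-rule mutation, and all the names are assumptions made for illustration.

```python
import random

STATES = ["A", "B", "C"]


def run(machine, max_steps):
    """Run a machine on a blank tape; return (halted, number of 1s on the tape)."""
    tape, head, state, steps = {}, 0, "A", 0
    while state != "H" and steps < max_steps:
        write, move, state = machine[(state, tape.get(head, 0))]
        tape[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    return state == "H", sum(tape.values())


def mutate(machine, rng):
    """Copy the machine and replace one randomly chosen transition rule."""
    child = dict(machine)
    key = rng.choice(sorted(child))
    child[key] = (rng.choice([0, 1]), rng.choice("LR"), rng.choice(STATES + ["H"]))
    return child


def evolve(generations, max_steps, seed):
    """Single-organism hill climb: a mutant replaces its parent only if it
    halts within the step budget and outputs a strictly larger number."""
    rng = random.Random(seed)
    # Start from a machine that halts at once and writes nothing.
    organism = {(s, b): (0, "R", "H") for s in STATES for b in (0, 1)}
    fitness = run(organism, max_steps)[1]
    history = [fitness]
    for _ in range(generations):
        child = mutate(organism, rng)
        halted, ones = run(child, max_steps)
        if halted and ones > fitness:
            organism, fitness = child, ones
        history.append(fitness)
    return organism, history


organism, history = evolve(generations=2000, max_steps=200, seed=0)
print(history[0], history[-1])
```

By construction the fitness never decreases, which mirrors the "environment rewards complexity" assumption; what the step budget cannot capture is exactly the uncomputable part of the original selection rule.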

Still, within its limited domain, he also stated a theorem to the effect that, for any given level of complexity (in the above sense), there is a path through the space of possible programs which reaches it, such that the “mutation distance” (roughly, the negative logarithm of the probability of a mutation occurring) at each step is bounded, and the complexity (therefore fitness, in this toy model) increases at each step. He indicated that one could prove this using the bits of the halting probability Omega – he didn’t specify how, and this isn’t something I’m very familiar with, but apparently (as described in the linked article), there are somewhat standard ways to do this kind of thing.

So anyway, this little toy model doesn’t really do the job Chaitin is saying ought to be done, but it illustrates what the kind of theorems he’s asking for might look like. My reaction is that it would be great to have theorems like this that could tell us something meaningful about real biology (so the toy model certainly is too simple), though I’m not totally convinced there needs to be a “new field” for such study. But certainly theoretical biology seems to be much less developed than, say, theoretical physics, and even if rigorous proofs aren’t going to be as prominent there, if some can be found, it probably couldn’t hurt.

After the talk, there was some interesting discussion about other things going on in theoretical biology and “systems biology”. Chaitin commented that a lot of the work in this field involves detailed simulations of models of real systems, made as accurate as possible – which, while important, is different from the kind of pursuit of basic theoretical principles he was talking about. So this would include things like: modeling protein folding; studying patterns in big databases of gene frequencies in populations and how they change in time; biophysical modeling of organs and the biochemical reactions in them; simulating the dynamics of individual cells, their membranes and the molecular machinery that makes them work; and so on. All of which has been moving rapidly in recent years, but is only tangentially related to fundamental principles about how life works.

On the other hand, as audience members pointed out, there is another thread, exemplified by the Santa Fe Institute, which is more focused on understanding the dynamics of complex systems. Some well-known names in this area would be Stuart Kauffman, John Holland and Per Bak, among others. I’ve only looked into this stuff at the popular level, but there are some interesting books about their work – Holland’s “Hidden Order”, Kauffman’s “The Origins of Order” (more technical) and “At Home in the Universe” (more popular), and Solé and Goodwin’s “Signs of Life” (a popular survey, but with equations, of various aspects of mathematical approaches to biological complexity). Chaitin’s main comment on this stuff is that it has produced plenty of convincing heuristic arguments, simulations and models with suggestive behaviour, and so on – but not many rigorous theorems. So: it’s good, but not exactly what he meant by “metabiology”.

Summarizing this stuff would be a big task in itself, but it does connect to Chaitin’s point that it might be nice to know (rigorously) if Darwinian evolution by itself were NOT enough to explain the complexity of living things. Stuart Kauffman, for example, has suggested that certain kinds of complex order tend to arise through “self-organization”. Philosopher Daniel Dennett commented on this in “Darwin’s Dangerous Idea”, saying that although this might be true, at most it tells us more detail about what kinds of things Darwinian selection has available to act on.

This all seems to tie into the question of which appeared first as life was coming into being: self-replicating molecules like RNA (and later DNA), or cells with metabolic reactions occurring inside. Organisms obviously both reproduce and metabolize, but these are two quite different kinds of process, and there seems to be a “chicken-and-egg” problem with which came first. Kauffman, among others, has looked at the emergence of “autocatalytic networks” of chemical reactions: these are collections of chemical reactions, some or all of which need a catalyst, such that all the catalysts needed to make them run are products of some reaction in the network. They’ve shown in simulation that such networks can arise spontaneously under certain conditions – suggesting that metabolism might have come into existence without DNA or similar molecules around (one also thinks of larger phenomena, like the nitrogen cycle). In any case, this is the kind of thing which people sometimes point to when suggesting that Darwinian selection isn’t enough to completely explain the structure of organisms actually existing today. Which is a different claim (mind you) than the claim that Darwinian evolution could not possibly produce complex organisms. Chaitin’s whole motivation was to suggest that it should be provable one way or the other (and, he presumes, in the affirmative) whether mutation and selection CAN do this job. If it could be proved that it can’t – at least there are some other ingredients to consider.

All in all, I found the talk thought-provoking, in spite (or because) of being partial and inconclusive.  Biology may be less rigorous than physics, but this could just be a sign that there’s a lot to learn and do in the field – and a lot of it is being done!

I just posted the slides for “Groupoidification and 2-Linearization”, the colloquium talk I gave at Dalhousie when I was up in Halifax last week. I also gave a seminar talk in which I described the quantum harmonic oscillator and extended TQFT as examples of these processes, which covered similar stuff to the examples in a talk I gave at Ottawa, as well as some more categorical details.

Now, in the previous post, I was talking about different notions of the “state” of a system – all of which are in some sense “dual to observables”, although exactly what sense depends on which notion you’re looking at. Each concept has its own particular “type” of thing which represents a state: an element-of-a-set, a function-on-a-set, a vector-in-(projective)-Hilbert-space, and a functional-on-operators. In light of the above slides, I wanted to continue with this little bestiary of ontologies for “states” and mention the versions suggested by groupoidification.

State as Generalized Stuff Type

This is what groupoidification introduces: the idea of a state in Span(Gpd). As I said in the previous post, the key concepts behind this program are state, symmetry, and history. “State” is in some sense a logical primitive here – given a bunch of “pure” states for a system (in the harmonic oscillator, you use the nonnegative integers, representing n-photon energy states of the oscillator), and their local symmetries (the n-particle state is acted on by the permutation group on n elements), one defines a groupoid.

So at a first approximation, this is like the “element of a set” picture of state, except that I’m now taking a groupoid instead of a set. In a more general language, we might prefer to say we’re talking about a stack, which we can think of as a groupoid up to some kind of equivalence, specifically Morita equivalence. But in any case, the image is still that a state is an object in the groupoid, or point in the stack which is just generalizing an element of a set or point in configuration space.

However, what is an “element” of a set S? It’s a map into S from the terminal object in \mathbf{Sets}, which is “the” one-element set – or, likewise, in \mathbf{Gpd}, from the terminal groupoid, which has only one object and its identity morphism. However, this is a category where the arrows are set maps. When we introduce the idea of a “history”, we’re moving into a category where the arrows are spans, A \stackrel{s}{\leftarrow} X \stackrel{t}{\rightarrow} B (which by abuse of notation sometimes gets called X but more formally (X,s,t)). A span represents a set/groupoid/stack of histories, with source and target maps into the sets/groupoids/stacks of states of the system at the beginning and end of the process represented by X.

Then we don’t have a terminal object anymore, but the same object 1 is still around – only the morphisms in and out are different. Its new special property is that it’s a monoidal unit. So now a map from the monoidal unit is a span 1 \stackrel{!}{\leftarrow} X \stackrel{\Phi}{\rightarrow} B. Since the map on the left is unique, by definition of “terminal”, this is really just given by the functor \Phi, the target map. This is a fibration over B, called here \Phi for “phi”-bration, but this is appropriate, since it corresponds to what’s usually thought of as a wavefunction \phi.

This correspondence is what groupoidification is all about – it has to do with taking the groupoid cardinality of fibres, where a “phi”bre of \Phi is the essential preimage of an object b \in B – everything whose image is isomorphic to b. This gives an equivariant function on B – really a function of isomorphism classes. (If we were being crude about the symmetries, it would be a function on the quotient space – which is often what you see in real mechanics, when configuration spaces are given by quotients by the action of some symmetry group).
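The groupoid cardinality used here is the sum, over isomorphism classes of objects, of 1/|Aut|. A minimal sketch of how that behaves, assuming the simplest kind of example (a finite group acting on a finite set, so the groupoid is the action groupoid): the cardinality comes out as |X|/|G|, which need not be an integer. The particular action (S_3 permuting the positions of 2-colourings of three points) and the function names are illustrative choices.

```python
from fractions import Fraction
from itertools import permutations, product


def act(g, colouring):
    """Permute the positions of a colouring by the permutation g."""
    return tuple(colouring[g[i]] for i in range(len(g)))


def action_groupoid_cardinality(points, group):
    """Sum of 1/|Stab(x)| over one representative x from each orbit."""
    seen, total = set(), Fraction(0)
    for x in points:
        if x in seen:
            continue
        orbit = {act(g, x) for g in group}
        stabilizer = [g for g in group if act(g, x) == x]
        seen |= orbit
        total += Fraction(1, len(stabilizer))
    return total


S3 = list(permutations(range(3)))            # the 6 permutations of 3 positions
colourings = list(product([0, 1], repeat=3))  # the 8 ways to 2-colour 3 points

cardinality = action_groupoid_cardinality(colourings, S3)
print(cardinality)                            # 4/3
print(Fraction(len(colourings), len(S3)))     # 4/3 again, as |X| / |G|
```

The crude quotient would just count the four orbits; the groupoid cardinality instead weights each orbit by the inverse size of its symmetry group, giving 1/6 + 1/2 + 1/2 + 1/6 = 4/3.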

In the case where B is the groupoid of finite sets and bijections (sometimes called \mathbf{FinSet_0}), these fibrations are the “stuff types” of Baez and Dolan. This is a groupoid with something of a notion of “underlying set” – although a forgetful functor U: C \rightarrow \mathbf{FinSet_0} (giving “underlying sets” for objects in a category C) is really supposed to be faithful (so that C-morphisms are determined by their underlying set map). In a fibration, we don’t necessarily have this. The special case where the functor is faithful corresponds to “structure types” (or combinatorial species), where X is a groupoid of “structured sets”, with an underlying set functor (actually, species are usually described in terms of the reverse, fibre-selecting functor \mathbf{FinSet_0} \rightarrow \mathbf{Sets}, where the image of a finite set S consists of the set of all “\Phi-structured” sets, such as “graphs on S”, or “trees on S”, etc.). The fibres of a stuff type are sets equipped with “stuff”, which may have its own nontrivial morphisms (for example, we could have the groupoid of pairs of sets, and the “underlying” functor \Phi selects the first one).
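For structure types, the cardinality of the fibre over an n-element set can be computed by brute force: if there are |F[n]| structures on a fixed n-element set, the groupoid of structured n-element sets has cardinality |F[n]|/n!. The sketch below does this for two familiar examples, total orders and subsets, with function names that are mine. The resulting numbers are the coefficients of the usual exponential generating functions, 1/(1-z) and e^{2z} respectively.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def total_orders(n):
    """All total orders on the set {0, ..., n-1}."""
    return list(permutations(range(n)))


def subsets(n):
    """All subsets of the set {0, ..., n-1}."""
    return [c for k in range(n + 1) for c in combinations(range(n), k)]


def fibre_cardinality(structures, n):
    """Groupoid cardinality of the fibre over an n-element set: |F[n]| / n!."""
    return Fraction(len(structures(n)), factorial(n))


order_coeffs = [fibre_cardinality(total_orders, n) for n in range(6)]
subset_coeffs = [fibre_cardinality(subsets, n) for n in range(6)]
print(order_coeffs)    # every coefficient is 1
print(subset_coeffs)   # coefficients 2^n / n!
```

For total orders every fibre has cardinality 1, because a totally ordered set has no nontrivial symmetries and there is exactly one of each size up to isomorphism.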

Over a general groupoid, we have a similar picture, but instead of having an underlying finite set, we just have an “underlying B-object”. These generalized stuff types are “states” for a system with a configuration groupoid, in Span(\mathbf{Gpd}). Notice that the notion of “state” here really depends on what the arrows in the category of states are – histories (i.e. spans), or just plain maps.

Intuitively, such a state is some kind of “ensemble”, in statistical or quantum jargon. It says the state of affairs is some jumble of many configurations (which we apparently should see as histories starting from the vacuous unit 1), each of which has some “underlying” pure state (such as energy level, or what-have-you). The cardinality operation turns this into a linear combination of pure states by defining weights for each configuration in the ensemble collected in X.

2-State as Representation

A linear combination of pure states is, as I said, an equivariant function on the objects of B. It’s one way to “categorify” the view of a state as a vector in a Hilbert space, or map from \mathbb{C} (i.e. a point in the projective Hilbert space of lines in the Hilbert space H = \mathbb{C}[\underline{B}]), which is really what’s defined by one of these ensembles.

The idea of 2-linearization is to categorify, not a specific state \phi \in H, but the concept of state. So it should be a 2-vector in a 2-Hilbert space associated to B. The Hilbert space H was some space of functions into \mathbb{C}, which we categorify by taking instead of a base field, a base category, namely \mathbf{Vect}_{\mathbb{C}}. A 2-Hilbert space will be a category of functors into \mathbf{Vect}_{\mathbb{C}} – that is, the representation category of the groupoid B.

(This is all fine for finite groupoids. In the infinite case, there are some issues: it seems we really should be thinking of the 2-Hilbert space as a category of representations of an algebra. In the finite case, the groupoid algebra is a finite dimensional C*-algebra – that is, just a direct sum (over iso. classes of objects) of matrix algebras, which are the group algebras for the automorphism groups at each object. In the infinite dimensional world, you probably should be looking at the representations of the von Neumann algebra completion of the C*-algebra you get from the groupoid. There are all sorts of analysis issues about measurability that lurk in this area, but they don’t really affect how you interpret “state” in this picture, so I’ll skip it.)

A “2-state”, or 2-vector in this Hilbert space, is a representation of the groupoid(-algebra) associated to the system. The “pure” states are irreducible representations – these generate all the others under the operations of the 2-Hilbert space (“sum”, “scalar product”, etc. in their 2-vector space forms). Now, an irreducible representation of a von Neumann algebra is called a “superselection sector” for a quantum system. It’s playing the role of a pure state here.

There’s an interesting connection here to the concept of state as a functional on a von Neumann algebra. As I described in the last post, the GNS representation associates a representation of the algebra to a state. In fact, the GNS representation is irreducible just when the state is a pure state. But this notion of a superselection sector makes it seem that the concept of 2-state has a place in its own right, not just by this correspondence.

So: if a quantum system is represented by an algebra \mathcal{A} of operators on a Hilbert space H, that representation is a direct sum (or direct integral, as the case may be) of irreducible ones, which are “sectors” of the theory, in that any operator in \mathcal{A} can’t take a vector out of one of these “sectors”. Physicists often associate them with conserved quantities – though “superselection” sectors are a bit more thorough: a mere “selection sector” is a subspace where the projection onto it commutes with some subalgebra of observables which represent conserved quantities. A superselection sector can equivalently be defined as a subspace whose corresponding projection operator commutes with EVERYTHING in \mathcal{A}. In this case, it’s because we shouldn’t have thought of the representation as a single Hilbert space, but as a direct integral of some Hilbert bundle that lives on the space of irreps: it’s a 2-vector in \mathbb{Rep}(\mathcal{A}). Those projections are just part of the definition of such a bundle. The fact that \mathcal{A} acts on this bundle fibre-wise is just a consequence of the fact that the total H is a space of sections of the “2-state”. These correspond to “states” in the usual sense in the physical interpretation.

Now, there are 2-linear maps that intermix these superselection sectors: the ETQFT picture gives nice examples. Such a map, for example, comes up when you think of two particles colliding (drawn in that world as the collision of two circles to form one circle). The superselection sectors for the particles are labelled by (in one special case) mass and spin – anyway, some conserved quantities. But these are, so to say, “rest mass” – so there are many possible outcomes of a collision, depending on the relative motion of the particles. So these 2-maps describe changes in the system (such as two particles becoming one) – but in a particular 2-Hilbert space, say \mathbb{Rep}(X) for some groupoid X describing the current system (or its algebra), a 2-state \Phi is a representation of the resulting system. A 2-state-vector is a particular representation. The algebra \mathcal{A} can naturally be seen as a subalgebra of the automorphisms of \Phi.

So anyway, without trying to package up the whole picture – here are two categorified takes on the notion of state, from two different points of view.

I haven’t, here, got to the business about Tomita flows coming from states in the von Neumann algebra sense: maybe that’s to come.

In my post about my short talk at CQC, I mentioned that the groupoidification program in physics is based on a few simple concepts (most research programs are, I suppose). The ones I singled out are: state, symmetry, and history. But since concepts tend to seem simpler if you leave them undefined, there are bound to be subtleties here. Recently I’ve been thinking about the first one, state. What is a state? What is this supposedly simple concept?

Etymology isn’t an especially reliable indicator of what a word means, or even the history of a concept (words change meanings, and concepts shift over time), but it’s sometimes interesting to trace. The English word “state” comes from the Latin verb stare, meaning “to stand”, whose past participle is status, which is also borrowed directly into English. The Proto-Indo-European root sta- also means “stand”, and the English word “stand” in turn comes from this root, but this time via Germanic (along with “standard”). However, most of the words with this root come via various Latin intermediaries: state, stable, status, statue, stationary, station, and also substance, understand and others. The state of affairs is sometimes referred to as being “how things stand”, how they are, the current condition. Most of the words based on the sta- root imply non-motion (i.e. “stasis”). If anything, “state” (like “status”) carries this connotation less strongly than most, since the state of affairs can change – but it emphasizes how things stand now and not how they’re changing. From this sense, we also get the political meaning of “a state”, a reified version of a term originally meaning the political condition of a country (by analogy with Latin expressions like status rei publicae, the “condition of public affairs”).

So, narrowing focus now, the “state” of a physical system is the condition it’s in. In different models of physics, this is described in different ways, but in each case, by the “condition” we mean something like a complete description of all the facts about the system we can get. But this means different things in different settings. So I just want to take a look at some of them.

Think of these different settings for physics as being literally “settings” (but please excuse the pun) of the switches on a machine. Three of the switches are labelled Thermal, Quantum, and Relativistic. The “Thermal” switch varies whether or not we’re talking about thermodynamics or ordinary mechanics. The “Quantum” switch varies whether we’re talking about a quantum or classical system.

The “Relativistic” switch, which I’ll ignore for this post, specifies what kind of invariance we have: Galilean for Newton’s physics; Lorentzian for Special Relativity; general covariance for General Relativity. But this gets into dynamics, and “state” implies things are, well, static – that is, it’s about kinematics. At the very least, in Relativity, it’s not canonical what you mean by “now”, and so the definition of a state must include choosing a reference frame (in SR), or a Cauchy hypersurface (in GR). So let’s gloss over that for now.

When all these switches are in the “off” position, we have classical mechanics. Here, we think of a state as, at a first level of approximation, an element of a set. Now, for serious classical mechanics, this set will be a symplectic manifold, like the cotangent bundle T^*M of some manifold M. This is actually a bit subtle already, since a point in T^*M represents a collection of positions and momenta (or some generalization thereof): that is, we can start with a space of “static” configurations, parametrized by the values of some observable quantities, but a state (contrary to what etymology suggests) also includes momenta describing how those quantities are changing with time (which, in classical mechanics, is a fairly unproblematic notion).

The Hamiltonian picture of the dynamics of the system then tells us: given its state, what will be the acceleration, which we can then use to calculate states at future times. This requires a Hamiltonian, H, which we think of as the energy, which can be calculated from the state. So, for example, kinetic plus potential energy: in the case of a particle moving in a potential on a line, H = K + V = p^2/(2m) + V(q). The space of states can be described without much reference to the Hamiltonian, but once we have H, we get a flow on that space, transforming old states into new states with time.
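To make the idea of a flow on state space concrete, here is a minimal numerical sketch. It assumes a particle on a line with the harmonic potential V(q) = q^2/2 and unit mass (both choices are mine, purely for illustration), and it approximates the flow with a leapfrog step, which is one standard integrator among several. The function names are hypothetical.

```python
import math

def leapfrog(q, p, dV, m=1.0, dt=0.01, steps=1000):
    """Approximate the Hamiltonian flow of H = p^2/(2m) + V(q).

    Hamilton's equations are dq/dt = p/m and dp/dt = -V'(q).
    Each step is a half "kick" in p, a full "drift" in q, and another half kick.
    """
    for _ in range(steps):
        p -= 0.5 * dt * dV(q)
        q += dt * p / m
        p -= 0.5 * dt * dV(q)
    return q, p

def energy(q, p, V, m=1.0):
    """The Hamiltonian evaluated at the state (q, p)."""
    return p * p / (2.0 * m) + V(q)

# Assumed example potential: V(q) = q^2 / 2, so V'(q) = q.
V = lambda q: 0.5 * q * q
dV = lambda q: q

dt, steps = 0.01, 1000
q0, p0 = 1.0, 0.0                      # initial state: a point in phase space
q1, p1 = leapfrog(q0, p0, dV, dt=dt, steps=steps)

e0 = energy(q0, p0, V)
e1 = energy(q1, p1, V)
q_exact = math.cos(steps * dt)         # exact solution for this potential

print("energy drift:", abs(e1 - e0))
print("position error:", abs(q1 - q_exact))
```

The state is just the pair (q, p); the Hamiltonian is only used to decide how that pair moves, and the energy stays nearly constant along the computed path.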

Now if we turn on the “Thermal” switch, we have a different notion of state. The standard image for the classical mechanical system is that we may be talking about a particle, or a few particles, or perhaps a rigid object, moving in space, maybe subject to some constraints. In thermodynamics, we are thinking of a statistical ensemble of objects – in the simplest case, N identical objects – and want to ask how energy is distributed among them. The standard image is of a box full of gas at some temperature: it’s full of molecules, each with its own trajectory, and they interact through collisions and exchange energy and momentum. Rather than tracking the exact positions of molecules, in thermodynamics a “state” is a distribution, or more precisely a probability measure, on the space of such states. We don’t assume we know the detailed microstate of the system – the positions and momenta of all the particles in the gas – but only something about how these are distributed among them. This reflects the real fact that we can only measure things like pressure, temperature, etc. The measure is telling us the proportion of particles with positions and momenta in a given range.

This is a big difference for something described by the same word “state”. Even assuming our underlying space of “microstates” is still the same T^*M, the state is no longer a point. One way to interpret the difference is that here the state is something epistemic. It describes what we know about the system, rather than everything about it. The measure answers the question: “given what we know, what is the likelihood the system is in microstate X?” for each X. Now, of course, we could take a space of all such measures: given our previous classical system, it’s a space of functionals on C(T^*M). Then the state can again be seen as an element of a set. But it’s more natural to keep in view its nature as a measure, or, if it’s nice enough, as a positive function on the space of states. (It’s interesting that this is an object of the same type as the Hamiltonian – this is, intuitively, the basis of what Carlo Rovelli calls the “Thermal Time Hypothesis”, summarized here, which is secretly why I wanted to write on this topic. But more on that in a later post. For one thing, before I can talk about it, I have to talk about what comes next.)

Now turn off the “Thermal” switch, and think about the “Quantum” switch. Here there are a couple of points of view.

To begin with, we describe a system in terms of a Hilbert space, and a state is a vector in a Hilbert space. Again, this could be described as an element of a set, but the complex linear structure is important, so we keep thinking of it as fundamental to the type of a state. In geometric quantization, one often starts with a classical system with a state space like T^*M = X, and then takes the Hilbert space \mathcal{H}=L^2(X), so that a state is (modulo analysis issues) basically a complex-valued function on X. This is something like the (positive real-valued) measure which gives a thermodynamic state, but the interpretation is trickier. Of course, if \mathcal{H} is an L^2-space, we can recover a probability measure, since the square modulus of \phi \in \mathcal{H} has finite total measure (so we can normalize it). But this isn’t enough to describe \phi, and the extra information of phases goes missing. In any case, the probability measure no longer has the obvious interpretation of describing the statistics of a whole ensemble of identical systems – only the likelihood of measuring particular values for one system in the state \phi. (In fact, there are various no-go theorems getting in the way of a probability interpretation of \phi, though this again involves dynamics – a recurring theme is that it’s hard to reason sensibly about states without dynamics). So despite some similarity, this concept of “state” is very different, and phase is a key part of how it’s different. I’ll be jiggered if I can say why, though: most of the “huh?” factor in quantum mechanics lives right about here.

Another way to describe the state of a quantum system is related to this probability, though. The inner product of \mathcal{H} (whether we found it as an L^2-space or not) gives a way to talk about statistics of the system under repeated observations. Observables, which for the classical picture are described by functions on the state space X, are now self-adjoint operators on \mathcal{H}. The expectation value for an observable A in the state \phi is \langle \phi | A | \phi \rangle (note that the Dirac notation implicitly uses self-adjointness of A). So the state has another, intuitively easier, interpretation: it’s a real-valued functional on observables, namely the one I just described.
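As a small illustration of the state as a functional on observables, and of why the phase matters, here is a sketch for a two-dimensional Hilbert space. The choice of observable (the Pauli matrix usually written sigma_y) and of the two vectors is my own, for illustration only. Both vectors give the same probabilities |c|^2 in the standard basis, but they differ by a relative phase and give different expectation values.

```python
import math

def apply(op, vec):
    """Matrix-vector product, with the operator given as a list of rows."""
    return [sum(a * x for a, x in zip(row, vec)) for row in op]

def inner(u, v):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def expectation(op, vec):
    """The value <vec| op |vec> for a normalized vector."""
    return inner(vec, apply(op, vec))

# Assumed example observable: a self-adjoint 2x2 matrix (Pauli sigma_y).
sigma_y = [[0, -1j], [1j, 0]]

s = 1 / math.sqrt(2)
phi = [s, 1j * s]   # relative phase i between the two components
psi = [s, s]        # same moduli, no relative phase

probs_phi = [abs(c) ** 2 for c in phi]
probs_psi = [abs(c) ** 2 for c in psi]

print("probabilities:", probs_phi, probs_psi)
print("<phi|A|phi> =", expectation(sigma_y, phi))
print("<psi|A|psi> =", expectation(sigma_y, psi))
```

The probability measure alone cannot tell phi and psi apart, while the functional on observables can.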

The observables live in the algebra \mathcal{A} = \mathcal{B}(\mathcal{H}) of bounded operators on \mathcal{H}. Setting both Thermal and Quantum switches of our notion of “state” gives quantum statistical mechanics. Here, the “C*-algebra” (or von Neumann-algebra) picture of quantum mechanics says that really it’s the algebra \mathcal{A} that’s fundamental – it corresponds to actual operations we can perform on the system. Some of them (the self-adjoint ones) represent really very intuitive things, namely observables, which are tangible, measurable quantities. In this picture, \mathcal{H} isn’t assumed to start with at all – but when it is, the kind of object we’re dealing with is a density matrix. This is (roughly) a positive operator on \mathcal{H} of unit trace. In general a state on a von Neumann algebra is a positive linear functional which is normalized, taking the value 1 on the identity (the analogue of unit trace).

This is analogous to the view of a state as a probability measure (positive function with unit total integral) in the classical realm: if an observable is a function on states (giving the value of that observable in each state), then a measure is indeed a functional on the space of observables. A probability measure, in fact, is the functional giving the expectation value of the observable. (And, since variance and all the higher moments of the probability distribution for that observable are themselves defined as expectation values, it also tells us all of those.)
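Here is a minimal sketch of that classical statement on a finite set of microstates. The three microstates, their weights, and the energy function are made up for illustration; the only point is that the measure acts as a functional on observables, and that the variance is itself an expectation value.

```python
def expectation(measure, observable):
    """Apply a probability measure (dict: microstate -> weight) to an observable."""
    return sum(weight * observable(state) for state, weight in measure.items())

def variance(measure, observable):
    """Variance, written as the expectation of the squared deviation."""
    mean = expectation(measure, observable)
    return expectation(measure, lambda s: (observable(s) - mean) ** 2)

# Assumed toy microstates (q, p) with probability weights summing to 1.
measure = {(0, 0): 0.5, (1, 0): 0.25, (0, 2): 0.25}

# Assumed observable: H = p^2/(2m) + V(q) with m = 1 and V(q) = q^2/2.
energy = lambda s: s[1] ** 2 / 2 + s[0] ** 2 / 2

total_weight = sum(measure.values())
mean_energy = expectation(measure, energy)
var_energy = variance(measure, energy)

print("total weight:", total_weight)
print("mean energy:", mean_energy)
print("variance of energy:", var_energy)
```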

On the other hand, the Gelfand-Naimark-Segal theorem says that, given a state \phi : \mathcal{A} \rightarrow \mathbb{C}, there’s a representation of \mathcal{A} as an algebra of operators on some Hilbert space, and a vector v for which this \phi is just \phi(A) = \langle v | A | v \rangle. This is the GNS representation (and in fact it’s built by taking the regular representation of \mathcal{A} on itself by multiplication, with \mathcal{A} made into a Hilbert space by defining the inner product to make this property work, and with v = 1). So the view here is that a state is some kind of operation on observables – a much more epistemic view of things. So although the GNS theorem relates this to the vector-in-Hilbert-space view of “state”, they are quite different conceptually. (For one thing, the GNS representation is giving a different Hilbert space for each state, which undermines the sense that the space of ALL states is fundamentally “there”, but in both pictures \mathcal{A} is the same for all states.)
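The construction can be sketched in the simplest finite-dimensional case. Below I assume \mathcal{A} is the algebra of 2x2 complex matrices and the state is \phi(A) = Tr(\rho A) for an invertible density matrix \rho (my choice, so that no quotient by null vectors is needed; the general construction does need that quotient and a completion). The inner product on \mathcal{A} is \langle A, B \rangle = \phi(A^* B), the algebra acts on itself by left multiplication, and v is the identity matrix. All function names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def make_state(rho):
    """The functional A -> Tr(rho A) determined by a density matrix rho."""
    return lambda a: trace(matmul(rho, a))

def gns_inner(phi, a, b):
    """GNS inner product on the algebra itself: <a, b> = phi(a* b)."""
    return phi(matmul(dagger(a), b))

rho = [[0.7, 0], [0, 0.3]]            # assumed example: positive, unit trace
phi = make_state(rho)
one = [[1, 0], [0, 1]]                # the cyclic vector v = 1

A = [[1, 2 - 1j], [2 + 1j, -1]]       # a self-adjoint element
B = [[0, 1], [0, 0]]
C = [[1j, 0], [1, 2]]

# phi(A) is recovered as <v | A | v>, with A acting by left multiplication.
lhs = phi(A)
rhs = gns_inner(phi, one, matmul(A, one))

# Left multiplication respects adjoints: <B, A C> = <A* B, C>.
s1 = gns_inner(phi, B, matmul(A, C))
s2 = gns_inner(phi, matmul(dagger(A), B), C)

print(lhs, rhs)
print(s1, s2)
```

Changing \rho changes the inner product, and hence the Hilbert space, while the algebra stays the same, which is the point made in the parenthetical remark above.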

(This von Neumann-algebra point of view, by the way, gets along nicely with the 2-Hilbert space lens for looking at quantum mechanics, which may partly bridge the gap between it and the Hilbert-space view. The category of representations of a von Neumann algebra is a 2-Hilbert space. A “2-vector” (or “2-state”, if you like) in this category is a representation of the algebra. So the GNS representation itself is a “2-state”. This raises the question about 2-algebras of 2-operators, and John Baez’ question: “What is the categorified GNS theorem?” But let’s leave 2-states for later along with the rest.)

So where does this leave us regarding the meaning of “state”? The classical view is that a state is an element of some (structured) set. The usual quantum picture is that a state is, depending on how precise you want to be, either a vector in a Hilbert space, or a 1-d subspace of that Hilbert space – that is, a point in the projective Hilbert space. What these two views have in common is that there is some space of all “possible worlds”, i.e. of all ways things can be in the system being studied. A state is then a way of selecting one of these. The difference is in what this space of possible worlds is like – that is, which category it lives in – and how exactly one “selects” a state. How they differ is in the possibility of taking combinations of states. As for selecting states, Sets is a Cartesian category, with a terminal object 1 = {*}: an element of a set is a map from 1 into it. Hilb is a monoidal category, but not Cartesian: selecting a single vector has no obvious categorical equivalent, though selecting a 1-D subspace amounts to a map from \mathbb{C} (up to isomorphism). So the model of an “element” isn’t a singleton, it’s the complex line – and it relates to other possible spaces differently: not as a terminal object, but as a monoidal unit. This is a categorical way of saying how the idea of “state” is structurally different.

The thermal point of view is a little more epistemically subtle: for both classical and quantum pictures, it’s best thought of as, not a possible world, but a function acting on observables (that is, conditions of knowledge). In the classical picture, this is directly related to a space of possible worlds – it’s a measure on it, which we can think of as saying how a large ensemble of systems are distributed in that space. In the quantum picture, in some ways the most (epistemically) natural view, in terms of von Neumann algebras, breaks the connection to this notion of “possible worlds” altogether, since \mathcal{A} has representations on many different Hilbert spaces.

So a philosophical question is: what do these different concepts have in common that lets us use them all to represent the “same” root idea? Without actually answering this, I’ll just mention that at some point I’d like to talk a bit about “2-states” as 2-vectors, and in general how to categorify everything above.

So this paper of mine was recently accepted by the Journal of Homotopy and Related Structures (the version that was accepted should be reflected on the arXiv by tomorrow – i.e. July 10 – I’m not sure about the journal). It’s been a while since I sent out the earliest version, and most of the changes have involved figuring out who the audience is, and consequently what could be left out. I guess that’s a side-effect of taking an excerpt from my thesis, which was much longer. In any case, it now seems to have reached a final point. Some of what was in it – the section about cobordisms – is now in a paper (in progress) about TQFT. I don’t see anywhere else to include the other missing bit, however, which has to do with Lawvere theories, and since I just wrote a bunch about MakkaiFest, I thought I might include some of that here.

The paper came about because I was trying to write my thesis, which describes an extended TQFT as a 2-functor (and considers how it could produce a version of 3D quantum gravity). The 2-functor

Z_G : nCob_2 \rightarrow 2Vect

(or into 2Hilb) is an ETQFT. The construction of the 2-functor uses the fact that you can get spans of groupoids out of cospans of manifolds – and in particular, out of cobordisms. One problem is how to describe nCob_2 so that this works. It’s actually most naturally a cubical 2-category of some kind. The strict version of this concept is a double category – which has (in principle separate) categories of horizontal and vertical morphisms, as well as square 2-cells. Ideally, one would like a “weak” version, where composition of squares and morphisms can be only weakly associative (and have weak unit laws). A “pseudocategory” implements this where the only higher-dimensional morphisms are the squares, but it turns out to be strict in one direction, and weak in the other. As it happens, it’s a big pain to use only squares for the 2-morphisms.

Initially it seemed I would have to define a whole new structure to get weak composition in both directions, because in both directions, composition represents gluing bits of manifolds together along boundaries – using a diffeomorphism (or a smooth homeomorphism, depending on which kind of manifolds we’re dealing with). I called it a “double bicategory” and started trying to define it along the same lines as a double category. It then turned out that Dominic Verity had already defined a “double bicategory” – you can read the paper where I talk about how the notions are related. Here I want to talk about a few aspects which I cut out of the paper along the way.

The idea is that there are two ways of “categorifying”: internalization, and enrichment. A bicategory is a category enriched in Cat, the category of categories – for any two objects, there’s a whole hom-category of morphisms (and 2-morphisms). A double category is a category internal to Cat. This means you can think of it as a category of objects and a category of morphisms, equipped with functors satisfying all the usual properties for the maps in the definition of a category: composition functors, unit functors, and so forth. This definition turns out to be equivalent to the usual one. So I thought: why not do the same with bicategories?

Thus, the way I defined double bicategory was: “A bicategory internal to Bicat”. In the paper as it stands, that’s all I say. What I cut out was a sort of dangling loose end pointing toward Lawvere theories – or rather, a variant thereof – finite limit theories (for something more detailed, see this recent paper by Lack and Rosicky). As I mentioned in the previous post, a Lawvere theory is an approach to universal algebra – it formally defines a kind of object (e.g. group, ring, abelian group, etc.) as a functor from a category T which is the “theory” of such objects, while the functor is a “model” of the theory.

What makes it “universal” algebra is that it can involve definitions with many sorts of objects, many operations, given as arrows, of different arities (number of inputs and outputs). This last makes sense in the monoidal context, and in particular Cartesian. Making decisions like this – what class of categories and functors we’re dealing with – specifies which doctrine the theory lives in. In the case of bicategories, this is the doctrine of categories with finite limits. In a Lawvere theory in the original sense, the doctrine is categories with finite products – so if there’s an object G, there are also objects G^n for all n. Then there are things like multiplication maps m : G^2 \rightarrow G and so on. For a category or bicategory, multiplication might be partial – so we need finite limits. A model of a theory in this doctrine is a limit-preserving functor.

So what does the theory of bicategories look like? It’s easy enough to see if you think that a (small) bicategory is a “bicategory in Sets”, and reproduce the usual definition, omitting reference to sets. It has objects Ob, Mor, and 2Mor. (This fact already means this is a “multi-sorted” theory, which goes beyond what can be done with another approach to universal algebra based on monads). Furthermore, there are maps between these objects, interpreted as source, target, and identity maps of various sorts. These form diagrams, and since we’re in a finite limit theory, there must be various objects like Pairs = Mor \times_{Ob} Mor which for sets would have the interpretation “pairs of composable morphisms”. Then there’s a composition map \circ : Pairs \rightarrow Mor… and so on. In short, in describing the axioms for a bicategory in a “nice” way (i.e. in terms of arrows, commuting diagrams, etc.), we’re giving a presentation of a certain category, Th(Bicat), in generators and relations. Then a model of the theory is a functor Th(Bicat) \rightarrow \mathcal{C} – picking out a “bicategory in \mathcal{C}”.
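To see what the object Pairs looks like in a model in Sets, here is a small sketch. For simplicity it uses only the one-dimensional fragment (objects, morphisms, composition), so it is a model of the theory of categories rather than of bicategories, and the example category (the ordered set 0 ≤ 1 ≤ 2) is my own choice. The pullback Mor \times_{Ob} Mor is computed as the set of pairs whose source and target match.

```python
from itertools import product

# Assumed example: the category of the ordered set 0 <= 1 <= 2.
# There is exactly one morphism (i, j) whenever i <= j.
objects = [0, 1, 2]
morphisms = [(i, j) for i in objects for j in objects if i <= j]

source = lambda f: f[0]
target = lambda f: f[1]
identity = lambda x: (x, x)

# The pullback Mor x_Ob Mor: pairs (g, f) with source(g) = target(f).
pairs = [(g, f) for g, f in product(morphisms, morphisms)
         if source(g) == target(f)]

def compose(g, f):
    """The composition map Pairs -> Mor."""
    assert source(g) == target(f), "compose is only defined on Pairs"
    return (source(f), target(g))

# The axioms, checked as equations between maps on these finite sets.
closed = all(compose(g, f) in morphisms for g, f in pairs)
unital = all(compose(identity(target(f)), f) == f == compose(f, identity(source(f)))
             for f in morphisms)
associative = all(
    compose(h, compose(g, f)) == compose(compose(h, g), f)
    for h, g, f in product(morphisms, repeat=3)
    if source(h) == target(g) and source(g) == target(f)
)

print(len(morphisms), "morphisms,", len(pairs), "composable pairs")
print(closed, unital, associative)
```

Composition is a partial operation on Mor \times Mor, which is why the theory needs finite limits and not just finite products.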

Now, a bicategory in Sets is a bicategory. But a bicategory in Bicat is another matter. First of all, I should say there’s something kind of odd here, since Bicat is most naturally regarded as a tricategory. However, we can regard it as a category by disregarding higher morphisms and taking 2-functors only up to equivalence to make Bicat into an honest category with associative composition. Thus, if we have a functor F : Th(Bicat) \rightarrow Bicat, we have:

  • Bicategories F(Ob), F(Mor), and F(2Mor)
  • 2-Functors F(s), F(\circ) and so on
  • satisfying conditions implied by the bicategory axioms

But each of those bicategories (in Sets!) has sets of objects, morphisms, and 2-morphisms, and one can break all the functors apart into three collections of maps acting on each of these three levels. They’ll satisfy all the conditions from the axioms – in fact, they make three new bicategories. So, for example, the object-sets of the bicategories F(Ob), F(Mor) and F(2Mor) form a bicategory using the object maps of the 2-functors F(s) and so on.

So if we say the original bicategories F(Ob) and so on are “horizontal”, and these new ones are “vertical”, we have something resembling a double category, but weak (since bicategories are weak) in both directions. The result is most naturally a four-dimensional structure (the 2-morphisms in 2Mor are most conveniently drawn as 4d, which is shown in Table 2 of the paper).

Now, the paper as it is describes all this structure without explicitly mentioning the theory Th(Bicat) except in passing – one can define “internal bicategory” without it. This is why this is a “loose end” of this paper: a major benefit of using Lawvere-style theories is the availability of morphisms of theories, which don’t come up here.

In any case, with this 4D structure in hand, what I do in the paper is (a) get some conditions that allow one to decategorify it down to Verity’s version of “double bicategory” (and even down to a bicategory); and (b) show that double cospans are an example (double spans would do equally well, but the application is to cobordisms, which are cospans). My own reason for wanting to get down to a 2D structure is the application to extended TQFT, which means we want a 2-category of cobordisms, thought of in terms of (co)spans.

Maybe in a subsequent post I’ll talk about the example itself, but one point about internalization does occur to me. Double cospans give an example of a double bicategory in the sense above – a strict model of Th(Bicat) in Bicat. In fact, they consist of “(co)spans of (co)spans” in a way that Marco Grandis formalized in terms of powers \Lambda^n, where \Lambda is the diagram (i.e. category) \bullet \leftarrow \bullet \rightarrow \bullet. One can actually think of this in terms of internalization: these are spans in a category whose objects are spans in \mathcal{C}, and whose morphisms are triples of maps in \mathcal{C} linking two spans (likewise for the span-map 2-morphisms). Yet it’s manifestly edge-symmetric: both the horizontal and vertical bicategories are the same.

As I mentioned in the previous post, there are lots of nice examples of double categories which are not edge-symmetric – sets, functions, and relations; or rings, homomorphisms, and bimodules, say. In fact, the second is only a pseudocategory – weak in one direction (composition of bimodules by tensor product is really only defined up to isomorphism). This is a significant thing about non-edge-symmetric examples. There’s much less motive for assuming both directions are equally strict. It’s also more natural in some ways: a pseudocategory is a weak model of Th(Cat) in Cat – equations in the theory are represented by (coherent) isomorphisms. This is the most general situation, and a strict model is a special case.

In the bicategory world, as I said, Bicat is a tricategory, so weaker models than the one I’ve given are possible – though they’re not symmetric, and so while one direction has composition and units as weak as a bicategory, the other direction will be weaker still. Robert Paré, in a conversation at MakkaiFest, suggested that a nice definition for a cubical n-category might have each direction being one step weaker than the previous one – a natural generalization of pseudocategories. Maybe there’s a way to make this seem natural in terms of internalization? One can iterate internalizing: having defined double bicategories, collect them together and find models of Th(Bicat) in DblBicat, and so forth. Maybe doing this as weakly as possible would give this tower of increasing weakness.

Now, I don’t have a great punchline to sum all this up, except that internalization seems to be an interesting lens with which to look at cubical n-categories.

It’s taken me a while to write this up, since I’ve been in the process of moving house – packing and unpacking and all the rest. However, a bit over a week ago, I was in Montreal, attending MakkaiFest ‘09 at the Centre de Recherches Mathématiques at the University of Montréal (and a pre-conference workshop hosted at McGill, which I’m including in the talks I mention here). This was in honour of the 70th birthday of Mihaly (Michael) Makkai, of McGill University. Makkai has done a lot of important foundational work in logic, model theory, and category theory, and a great many of the talks were from former students who’d gone on and been inspired by him, so one got a sense of the range of things he’s worked on through his life.

The broad picture of Makkai’s work was explained to us by J.P. Marquis, from the Philosophy department at U of M. He is interested in philosophy of mathematics, and described Makkai’s project by contrast with the program of axiomatization of the early 20th century, along the lines suggested by Hilbert. This program provided a formal language for concrete structures – the problem, which category theory is part of a solution to, is to do the same for abstract structures. Contrast, for instance, the concrete description of a group G as a (particular) set with some (particular) operation, with the abstract definition of a group object in a category. Makkai’s work in categorical logic, said Marquis, is about formalizing the process of abstraction that example illustrates.

Model Theory/Logic

This matter – of the relation between abstract theories and concrete models of the theories – is really what model theory is about, and this is one of the major areas Makkai has worked on. Roughly, a theory is most basically a schema with symbols for types, members of types, and some function symbols – and a collection of sentences built using these symbols (usually generated from some axioms by rules of logical inference). A model is (intuitively) an interpretation of the terms: a way of assigning concrete data to the symbols – say, a symbol for a type is assigned the set of all entities of that type, and a function symbol is assigned an actual function between sets, and so on – making all the sentences of the theory true. A morphism of models is a map that preserves all the properties of the model that can be stated using first order logic.

This is an older way to say things – Victor Harnik gave an expository talk called “Model Theory vs. Categorical Logic” in which he compared two ways of adding an equivalence relation to a theory. The model theory way (invented by Shelah) involves taking the theory (list of sentences) T and extending it to a new theory T^{eq}. This has, for instance, some new types – if we had a type for “element of group”, for example, we might then get a new type “equivalence class of elements of group”, and so on. Now, this extension is “tight” in the sense that the categories of all models of T and of T^{eq} are equivalent (by a forgetful functor Mod(T^{eq}) \rightarrow Mod(T)) – but one can prove new theorems in the extended theory. To make this clear, he described work (due to Makkai and Reyes) about pretopos completion. Here, one has the concept of a “Boolean logical category” – Set is an example, as is, for any theory, a certain category whose objects are the formulas of the theory. This is related to Lawvere theories (see below). There are logical functors between such categories – functors into Set are models, but there are also logical functors between theories. The point is that a theory T embeds into T^{eq} (abusing notation here – these are now the Boolean logical categories), and that T^{eq} arises as a kind of completion of T – namely, it’s a Boolean pretopos (not just category). Moreover, it has some nice universal properties, making this point of view a bit more natural than the model-theoretic construction.

Bradd Hart’s talk, “Conceptual Completeness for Continuous Logic”, was a bit over my head, but made some use of this kind of extension of a theory to T^{eq}. The basic point seems to be to add some kind of continuous structure to logic. One example comes from a metric structure – defining a metric space of terms, where the metric function d(x,y) is some sum \sum_n \phi_n (x,y), where the \phi_n are formulas with two variables, either true or false – where true gives a 0, and false gives a 1 in this sum. This defines a distance from x to y associated to the given list of formulas \phi_n. A continuous logic is one with a structure like this. The business about equivalence relations arises if we say two things are equivalent when the distance between them is 0 – this leads to a concept of completion, and again there’s a notion that the categories of models are equivalent (though proving it here involves some notion of approximating terms to arbitrary epsilon, which doesn’t appear in standard logic).
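Just to fix ideas about the distance described above, here is a toy sketch; it illustrates only my summary, not the actual machinery of continuous logic from the talk. I assume a finite list of formulas, each of the special form \phi_n(x,y) = “P_n(x) if and only if P_n(y)” for some one-variable predicate P_n (an assumption of mine, which guarantees the triangle inequality), and a small universe of integers.

```python
# Assumed toy predicates P_n on the integers 0..5.
predicates = [
    lambda x: x % 2 == 0,   # P_0: "x is even"
    lambda x: x > 2,        # P_1: "x is greater than 2"
]

def distance(x, y):
    """Sum over n of phi_n(x, y), where a true formula contributes 0 and a false one 1.

    Here phi_n(x, y) says that P_n(x) and P_n(y) have the same truth value.
    """
    return sum(0 if p(x) == p(y) else 1 for p in predicates)

def equivalence_classes(points):
    """Group points that are at distance 0 from each other."""
    classes = []
    for x in points:
        for cls in classes:
            if distance(cls[0], x) == 0:
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

universe = list(range(6))
classes = equivalence_classes(universe)

print("d(0, 2) =", distance(0, 2))
print("d(0, 3) =", distance(0, 3))
print("classes:", classes)
```

Passing from the universe to the list of classes is the toy analogue of adding a type of equivalence classes.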

Anand Pillay gave a talk which used model theory to describe some properties of the free group on n generators. This involved a “theory of the free group” which applies to any free group, and regards each such group as a model of the theory – in fact a submodel of some large model – and then uses model-theoretic methods to examine “stability” properties, in some sense which amounts to a notion of defining “generic” subsets of the group.

Logic and Higher Categories

A number of talks specifically addressed the ground where logic meets higher dimensional categories, since Makkai has worked with both.

In one talk, Robert Paré described a way of thinking about first-order theories as examples of “double Lawvere theories”. Lawvere’s way of formalizing “theories and models” was to say that the theory is a category itself (which has just the objects needed to describe the kind of structure it’s a theory of) – and a model is a functor into Sets (or some other category – a model of the theory of groups in topological spaces, say, is a topological group). For example, the theory of groups includes an object G and powers of it, multiplication and inverse maps, and expresses the axioms by the fact that certain diagrams commute. A model is a functor M : Th(Grp) \rightarrow Sets, assigning to the “group object” a set of elements, which then get the group structure from the maps. A double Lawvere theory uses a double category instead of a category. There are two kinds of morphisms – horizontal and vertical – and these are used to represent two kinds of symbols: function symbols, and relation symbols. (For example, one can talk about the theory of an ordered field – so one needs symbols for multiplication and addition and so forth, but also for the order relation \leq). Then a model of such a theory is a double functor into the double category whose objects are sets, and whose horizontal and vertical morphisms are respectively functions and relations.
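
As a rough illustration of “a model is a functor into Sets”, here is a small Python sketch of what the data of a model of the theory of groups amounts to for a finite set: a set of elements, plus functions interpreting multiplication, unit and inverse, such that the axioms (the commuting diagrams) hold. The function name `is_group_model` and the examples are my own, purely for illustration.

```python
from itertools import product

def is_group_model(elements, mul, unit, inv):
    """Check that the group axioms hold for this interpretation of the symbols."""
    assoc = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                for a, b, c in product(elements, repeat=3))
    unital = all(mul(unit, a) == a == mul(a, unit) for a in elements)
    inverses = all(mul(a, inv(a)) == unit == mul(inv(a), a) for a in elements)
    return assoc and unital and inverses

Z5 = range(5)

# Addition modulo 5 interprets the symbols in a way that satisfies the axioms.
print(is_group_model(Z5, lambda a, b: (a + b) % 5, 0, lambda a: (-a) % 5))

# Subtraction modulo 5 is an interpretation of the symbols, but not a model.
print(is_group_model(Z5, lambda a, b: (a - b) % 5, 0, lambda a: a))
```

The same symbols could be interpreted in another category (continuous maps between spaces, say), which is the sense in which a topological group is a model in topological spaces.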

André Joyal gave a talk about the first order logic of higher structures. He started by commenting on some fields which began life close together, and are now gradually re-merging: logic and category theory; category theory and homotopy theory (via higher categories); homotopy theory and algebraic geometry. The higher categories Joyal was thinking of are quasicategories, or “( \infty, 1)-categories”, which are simplicial sets satisfying a weak version of a horn-filling condition (a Kan complex satisfies the full condition; a “strict” example, where the fillers are unique, is N(C), the nerve of a category C – there’s an n-simplex for each sequence of n composable morphisms, whose other edges are the various composites, and whose faces are “compositors”, “associators”, and so on – which for N(C) are identities). The point of this is that one can reproduce most of category theory for quasicategories – in particular, he mentioned limits and colimits, factorization systems, pretoposes, and model theory.

Moving to quasicategories on one side of the parallel between category theory and logic has a corresponding move on the other side – on the logic side, one aspect is that the usual notion of a language is replaced by what’s called Martin-Löf type theory. This, in fact, was the subject of Michael Warren’s talk, “Martin-Löf complexes” (I reported on a similar talk he gave at Octoberfest last year). The idea here is to start by defining a globular set, given a theory and type A – a complex whose n-cells have two faces, of dimension (n-1). The 0-cells are just terms of some type A. The 1-cells are terms of types like \underline{A}(a,b), where a and b are variables of type A – the type has an interpretation as a proposition that a=b “extensionally” (i.e. not via a proof – but as for instance when two programs with non-equivalent code happen to always produce the same output). This kind of operation can be repeated to give higher cells, like \underline{A(a,b)}(f,g), and so on. Given a globular set G, one gets a theory by an adjoint construction. Putting the two together, one has a monad on the category of globular sets – algebras for the monad are Martin-Löf complexes. Throwing in syntactic rules to truncate higher cells (I suppose by declaring all cells to be identities) gives n-truncated versions of these complexes, MLC_n. Then there is some interesting homotopy theory, in that the category of n-truncated Martin-Löf complexes is expected to be a model for homotopy n-types. For example, MLC_0 is equivalent to Sets, and there is an adjunction (in fact, a Quillen equivalence – that is, a kind of “homotopy” equivalence) between MLC_1 and Gpd.

Category Theory/Higher Categories

There were a number of talks that just dealt with categories – including higher categories – in their own right. Makkai has worked, for example, on computads, which were touched on by Marek Zawadowski in one of his two talks (one in the pre-conference workshop, the other in the conference). The first was about categories of “many-to-one shapes”, which are important to computads – these are a notion of higher-category, where every cell takes many “input” faces to one “output” face. Zawadowski described a “shape” of an n-cell as an initial object in a certain category built from the category of computads with specified faces. Then there’s a category of shapes, and an abstract description of “shape” in terms of a graded tensor theory (graded for dimension, and tensor because there’s a notion of composition, I believe). Zawadowski’s second talk, “Opetopic Sets in Lax Monoidal Fibrations”, dealt with a similar topic from a different point of view. A lax monoidal fibration (LMF) is a kind of gadget for dealing with multi-level structures (categories, multicategories, quasicategories, etc). There’s a lot of stuff here I didn’t entirely follow, but just to illustrate: categories arise as LMF, by the fibration cod : Set^{B} \rightarrow Set, where B is the category with two objects M, O, and two arrows from M to O. An object in the functor category Set^{B} consists of a “set of morphisms and set of objects” with maps – making this a category involves the monoidal structure, and how composition is defined, and the real point is that this is quite general machinery.

Joachim Lambek and Gonzalo Reyes, both longtime collaborators and friends of Makkai, also both gave talks that touched on physics and categories, though in very different ways. Lambek talked about the “Lorentz category” and its appearance in special relativity. This involves a reformulation of SR in terms of biquaternions: like complex numbers, these are of the form u + iv, but u and v are quaternions. They have various conjugation operations, and the geometry of SR can be described in terms of their algebra (just as, say, rotations in 3D can be described in terms of quaternions). The Lorentz category is a way of organizing this – its two objects correspond to “unconjugated” and “conjugated” states.
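
For the analogy with rotations in 3D, here is a minimal Python sketch of the standard quaternion recipe: a vector v is rotated by a unit quaternion q via q v q*, where q* is the conjugate. This only illustrates the quaternion side of the analogy, not Lambek’s biquaternion construction, and the helper names are my own.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions written as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def conj(q):
    """Quaternion conjugate: flip the sign of the imaginary parts."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def rotate(v, axis, angle):
    """Rotate the 3-vector v about a unit axis by the given angle (radians)."""
    h = angle / 2
    q = (math.cos(h),) + tuple(math.sin(h) * x for x in axis)
    return qmul(qmul(q, (0.0,) + tuple(v)), conj(q))[1:]

# A quarter turn about the z axis takes the x axis (approximately) to the y axis.
print(rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))
```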

Gonzalo Reyes gave a derivation of General Relativity in the context of synthetic differential geometry. The substance of this derivation is not so different from the usual one, with one exception. Einstein’s field equations can be derived in terms of the motions of small regions full of freely falling test particles – synthetic differential geometry makes it possible to do the same analysis using infinitesimals rigorously all the way through. The basic point here is that in SDG one replaces the real line as usually conceived, with a “real line with infinitesimals” (think of the ring \mathbb{R}[\epsilon]/\langle \epsilon^2 \rangle, which is like the reals, but has the infinitesimal \epsilon, whose square is zero).
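
The ring \mathbb{R}[\epsilon]/\langle \epsilon^2 \rangle is easy to play with directly. Here is a small Python sketch of it (the class name `Dual` and the test function `f` are my own choices; this models only the ring, not SDG itself). Since \epsilon^2 = 0, evaluating a polynomial at x + \epsilon gives f(x) + f'(x)\epsilon exactly, with no limits involved.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """An element a + b*eps of R[eps]/<eps^2>."""
    real: float
    eps: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)
    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps^2 = 0
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)
    __rmul__ = __mul__

def _lift(x):
    """Treat an ordinary number as a dual number with zero infinitesimal part."""
    return x if isinstance(x, Dual) else Dual(float(x))

EPS = Dual(0.0, 1.0)

def f(x):
    return x * x * x + 2 * x

print(EPS * EPS)   # the square of the infinitesimal is zero
print(f(3 + EPS))  # real part is f(3), eps part is f'(3)
```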

Among other talks: John Power talked about the correspondence between Lawvere theories in universal algebra and finitary tree monads on sets – and asked about what happens to the left-hand side of this correspondence when we replace “sets” with other categories on the right-hand side. Jeff Egger talked about measure theory from a categorical point of view – namely, the correspondence of NCG between C*-algebras and “noncommutative” topological spaces, and between W*-algebras and “noncommutative” measure spaces, thought of in terms of locales. Hongde Hu talked about the “codensity theorem”, and a way to classify certain kinds of categories – he commented on how it was inspired by Makkai’s approach to mathematics: (1) find new proofs of old theorems, (2) standardize the concepts used in them, and (3) prove new theorems with those concepts. Fred Linton gave a talk describing Heath’s “V-space”, which is a half-plane with a funny topology whose open sets are “V” shapes, and described how the topos of locally finite sheaves over it has surprising properties having to do with nonexistence of global sections. Manoush Sadrzadeh, whom I met recently at CQC (see the bottom of the previous post) was again talking about linguistics using monoidal categories – she described some rules for “clitic movement” and changes in word order, and what these rules look like in categorical terms.

Other

A few other talks are a little harder for me to fit into the broad classification above.  There was Charles Steinhorn’s talk about ordered “o-minimal” structures, which touched on a bit of economics – essentially, a lot of economics is based on the assumption that preference orders can be made into real-valued functions, but in fact in many cases one has (variants on) “lexicographic order”, involving ranked priorities.  He talked about how typically one has a space of possibilities which can be cut up into cells, with one sort of order in each cell.  There was Julia Knight, talking about computable structures of “high Scott rank” – in particular, this is about infinite structures that can still be dealt with computably – for example, infinitary logical formulas involving an infinite number of “OR” statements where all the terms being joined are of some common form.  This ends up with an analysis of certain infinite trees.  Hal Kierstead gave a talk about Ramsey theory which I found notable because it used the kind of construction based on a game: to prove that any colouring of a graph (or hypergraph) has some property, one devises a game where one player tries to build a graph, and the other tries to colour it, and proves a winning strategy for one player.  Finally, Michael Barr gave a talk about a duality between certain categories of modules over commutative rings.

All in all, an interesting conference, with plenty of food for thought.


Continuing from the previous post…

I realized I accidentally omitted Klaas Landsman’s talk on the Kochen-Specker theorem, in light of topos theory. This overlaps a lot with the talk by Andreas Döring, although there are some significant differences. (Having heard only what Andreas had to say about the differences, I won’t attempt to summarize them). Again, the point of the Kochen-Specker theorem is that there isn’t a “state space” model for a quantum system – in this talk, we heard the version saying that there are no “locally sigma-Boolean” maps, from operators on a Hilbert space, to \{ 0, 1 \}. (This is referring to sigma-algebras (of measurable sets on a space), and Boolean algebras of subsets – if there were such a map, it would be representing the system in terms of a lattice equivalent to some space). As with the Isham/Döring approach, they then try to construct something like a state space – internal to some topos. The main difference is that the toposes are both categories of functors into sets from some locale – but here the functors are covariant, rather than contravariant.

Now, roughly speaking, the remaining talks could be grouped into two kinds:

Quantum Foundations

Many people came to this conference from a physics-oriented point of view.  So for instance Rafael Sorkin gave a talk asking “what is a quantum reality?”. He was speaking from a “histories” interpretation of quantum systems. So, by contrast, a “classical reality” would mean one worldline: out of some space of histories, one of them happens. In quantum theory, you typically use the same space of histories, but have some kind of “path integral” or “sum over histories” when you go to compute the probabilities of given events happening. In this context, “event” means “a subset of all histories” (e.g. the subset specified by a statement like “it rained today”). So his answer to the question is: a reality should be a way of answering all questions about all events.  This is called a “coevent”.  Sorkin’s answer to “what is a quantum reality?” is: “a primitive, preclusive coevent”.

In particular, it’s a measure \mu.  For a classical system, “answering” questions means yes/no, whether the one history is in a named event – for a quantum system, it means specifying a path integral over all events – i.e. a measure on the space of events.  This measure needs some nice properties, but it’s not, for instance, a probability measure (it’s complex valued, so there can be interference effects).  Preclusion has to do with the fact that the measure of an event being zero means that it doesn’t happen – so one can make logical inferences about which events can happen.
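
A toy Python sketch of the contrast, with made-up numbers: three histories, each given a complex amplitude I chose by hand, and a measure obtained by summing amplitudes over the histories in an event. This is only meant to show how interference can make an event have measure zero (and so be precluded) even though its sub-events do not; it is my own simplification, not Sorkin’s actual formalism.

```python
# Hypothetical toy: three histories with made-up complex amplitudes.
amplitude = {"h1": 0.5 + 0j, "h2": -0.5 + 0j, "h3": 0.5j}

def classical_answer(history, event):
    """Classical reality: one history happens, so an event is answered yes/no."""
    return history in event

def mu(event):
    """Complex, additive measure of an event (a set of histories)."""
    return sum(amplitude[h] for h in event)

def precluded(event, tol=1e-12):
    """An event of measure zero does not happen."""
    return abs(mu(event)) < tol

print(classical_answer("h1", {"h1", "h2"}))
print(precluded({"h1"}), precluded({"h2"}))  # neither history alone is precluded
print(precluded({"h1", "h2"}))               # but together they interfere away
```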

Other talks addressing foundational problems in physics included Lucien Hardy’s: he talked about how to base predictive theories on operational structures – and put to the audience the question of whether the structures he was talking about can be represented categorically or not. The basic idea is that an “operational structure” is some collection of operations that represents a physical experiment whose outcome we might want to predict. They have some parameters (“knob settings”), outcomes (classical “readouts”), and inputs and outputs for the things they study and affect (e.g. a machine takes in and spits out an electron, doing something in the middle). This sort of thing can be set up as a monoidal category – but the next idea, “object-oriented operationalism”, involved components having “connections” (given relations between their inputs) and “coincidences” (predictable correlations in output). The result was a different kind of diagram language for describing experiments, which can be put together using a “causaloid product” (he referred us to this paper, or a similar one, on this).

Robert Spekkens gave a talk about quantum theory as a probability theory – there are many parallels, though the complex amplitudes give QM phenomena like interference. Instead of a “random variable” A, one has a Hilbert space H_A; instead of a (positive) function of A, one has a positive operator on H_A; standard things in probability have analogs in the quantum world. What Robert Spekkens’ talk dealt with was how to think about conditional probabilities and Bayesian inference in QM. One of the basic points is that when calculating conditional probabilities, you generally have to divide by some probability, which encounters difficulties translating into QM. He described how to construct a “conditional density operator” along similar lines – replacing “division” by a “distortion” operation with an analogous meaning. The whole thing deeply uses the Choi-Jamiolkowski isomorphism, a duality between “states and channels”. In terms of the string diagrams Bob Coecke et al. are keen on, this isomorphism can be seen as taking a special cup which creates entangled states into an ordinary cup, with an operator on one side. (I.e. it allows the operation to be “slid off” the cup). The talk carried this through, and ended up defining a quantum version of the probabilistic concept of “conditional independence” (i.e. events A and C are independent, given that B occurred).

A more categorical look at foundational questions was given by Rick Blute’s talk on “Categorical Structures in AQFT”, i.e. Algebraic Quantum Field Theory. This is a formalism for QFT which takes into account the causal structure it lives on – for example, on Minkowski space, one has a causal order for points, with x \leq y if there is a future-directed null or timelike curve from x to y. Then there’s an “interval” (more literally, a double cone) [x,y] = \{ z | x \leq z \leq y\}, and these cones form a poset under inclusion (so this is a version of the poset of subspaces of a space which keeps track of the causal structure). Then an AQFT is a functor \mathbb{A} from this poset into C*-algebras (taking inclusions to inclusions): the idea is that each local region of space has its own algebra of observables relevant to what’s found there. Of course, these algebras can all be pieced together (i.e. one can take a colimit of the diagram of inclusions coming from all regions of spacetime). The result is \hat{\mathbb{A}}. Then, one finds a category of certain representations of it on a Hilbert space H (namely, “DHR” representations). It turns out that this category is always equivalent to the representations of some group G, the gauge group of the AQFT. Rick talked about these results, and suggested various ways to improve on them – for example, by improving how one represents spacetime.
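
The causal order and the double cones are simple enough to sketch in Python. I am assuming 1+1-dimensional Minkowski space, units with c = 1, and points written as (t, s) pairs; the function names are my own. In these coordinates, x \leq y exactly when the time difference is at least as big as the spatial separation.

```python
def precedes(x, y):
    """x <= y: y lies on or inside the future light cone of x (c = 1)."""
    dt, ds = y[0] - x[0], y[1] - x[1]
    return dt >= abs(ds)

def in_double_cone(z, x, y):
    """z lies in the interval [x, y] = { z | x <= z <= y }."""
    return precedes(x, z) and precedes(z, y)

x, y = (0.0, 0.0), (2.0, 0.0)
print(precedes(x, (2.0, 1.0)))             # timelike separated, in the future
print(precedes(x, (1.0, 2.0)))             # spacelike separated
print(in_double_cone((1.0, 0.5), x, y))
print(in_double_cone((1.0, 1.5), x, y))

# A smaller cone [x2, y2] with x <= x2 and y2 <= y sits inside [x, y]; here
# that is checked on a small grid of sample points.
x2, y2 = (0.5, 0.0), (1.5, 0.0)
grid = [(t / 4, s / 4) for t in range(0, 9) for s in range(-8, 9)]
print(all(in_double_cone(z, x, y) for z in grid if in_double_cone(z, x2, y2)))
```

The last check is the inclusion ordering on double cones that the functor \mathbb{A} is supposed to respect.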

The last talk I’d attempt to shoehorn into this category was by Daniel Lehmann.  He was making an analysis of the operation “tensor product”, that is, the monoidal operation in Hilb.  For such a fundamental operation – physically, it represents taking two systems and looking at the combined system containing both – it doesn’t have a very clear abstract definition.  Lehmann presented a way of characterizing it by a universal property analogous to the universal definitions for products and coproducts.  This definition makes sense whenever there is an idea of a “bimorphism” – a thing which abstracts the properties of a “bilinear map” for vector spaces.  This seems to be closely related to the link between multicategories and monoidal categories (discussed in, for example, Tom Leinster’s book).
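
For comparison, the familiar universal property for plain vector spaces (not Lehmann’s definition for Hilbert spaces, which I won’t try to reproduce) says that bilinear maps out of A \times B are the same thing as linear maps out of A \otimes B. In LaTeX, assuming the amsmath package for \text:

```latex
\[
  \forall\, f : A \times B \to C \ \text{bilinear}
  \quad \exists!\ \tilde{f} : A \otimes B \to C \ \text{linear such that}
  \quad \tilde{f}(a \otimes b) = f(a, b)
  \quad \text{for all } a \in A,\ b \in B .
\]
```

Here f plays the role of the “bimorphism”, and the map (a, b) \mapsto a \otimes b is the universal one.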

Categories and Logic

Some less physics-oriented and more categorical talks rounded out the part of the program that I saw. One I might note was Mike Stay’s talk about the Rosetta Stone paper he wrote with John Baez. The Rosetta Stone, of course, was a major archaeological find from the Ptolemaic period in Egypt – by that point, Egypt had been conquered by Alexander of Macedon and had a Greek speaking elite, but the language wasn’t widespread. So the stone is an official pronouncement with a message in Greek, and in two written forms of Egyptian (hieroglyphic and demotic), neither of which had been readable to moderns until the stone was uncovered and correspondences could be deduced between the same message in a known language and two unknown ones. The idea of their paper, and Mike’s talk, is to collect together analogs between four subjects: physics, topology, computation, and logic. The idea is that each can be represented in terms of monoidal categories. In physics, there is the category of Hilbert spaces; in topology one can look at the category of manifolds and cobordisms; in computation, there’s a monoidal category whose objects are data types, and whose morphisms are (equivalence classes) of programs taking data of one type in and returning data of another type; in logic, one has objects being propositions and morphisms being (classes) of proofs of one proposition from another. The paper has a pretty extensive list of analogs between these domains, so go ahead and look in there for more!

Peter Selinger gave a talk about “Higher-Order Quantum Computation”. This had to do with interesting phenomena that show up when dealing with “higher-order types” in quantum computers. These are “data types”, as I just described – the “higher-order” types can be interpreted by blurring the distinction between a “system” and a “process”. A data type describing a system we might act on might be A or B. A higher order type like A \multimap B describes a process which takes something of type A and returns something of type B. One could interpret this as a black box – and performing processes on a type A \multimap B is like studying that black box as a system itself. This type is like an “internal hom” – and so one might like to say, “well, it’s dual to tensor – so it amounts to taking A^* \otimes B, since we’re in the category of Hilbert spaces”. The trouble is, for physical computation, we’re not quite in the category where that works, because not all operators are significant: only some class of completely positive operators is physical. So we don’t have the hom-tensor duality to use (equivalently, don’t have a well-behaved dual), and these types have to be considered in their own right. And, because computations might not halt, operations studying a black box might not halt. So in particular, a “co-co-qubit” isn’t the same as a qubit. A co-qubit is a black box which eats a qubit and terminates with some halting probability. A co-co-qubit eats a co-qubit and does the same. If not for the halting probability, one could equally well see a qubit “eating” a co-co-qubit as the reverse. But in fact they’re different. A key fact in Peter’s talk is that quantum computation has new logical phenomena happening with types of every higher order. Quantifying this (an open problem, apparently) would involve finding some equivalent of Bell inequalities that apply to every higher order of type. It’s interesting to see how different quantum computing is, in not-so-obvious ways, from the classical kind.

Manoush Sadrzadeh gave a talk describing how “string diagrams” from monoidal categories, and representations of them, have been used in linguistics. The idea is that the grammatical structure of a sentence can be built by “composing” structures associated to words – for example, a verb can be composed on left and right with subject and object to build a phrase. She described some of the syntactic analysis that went into coming up with such a formalism. But the interesting bit was to compare putting semantics on that syntax to taking a representation. In particular, she described the notion of a semantic space in linguistics: this is a large-dimensional vector space that compares the meanings of words. A rough but surprisingly effective way to clump words together by meaning just uses the statistics on a big sample of text, measuring how often they co-occur in the same context. Then there is a functor that “adds semantics” by mapping a category of string diagrams representing the syntax of sentences into one of vector spaces like this. Applying the kind of categorical analysis usually used in logic to natural language seemed like a pretty neat idea – though it’s clear one has to make many more simplifying assumptions.
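
Here is a very small Python sketch of the co-occurrence idea, under my own assumptions: a three-sentence toy corpus, a context window of two words on each side, and cosine similarity to compare the resulting vectors. It only illustrates the statistical “semantic space”; the functor from syntax is not modelled at all.

```python
import math
from collections import Counter

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            context = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

# Made-up toy corpus.
corpus = ["the cat drinks milk", "the dog drinks water", "the cat chases the dog"]
vec = cooccurrence_vectors(corpus)

# "cat" and "dog" occur in similar contexts, so they end up closer together
# than "cat" and "milk" do.
print(cosine(vec["cat"], vec["dog"]) > cosine(vec["cat"], vec["milk"]))
```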

On the whole, it was a great conference with a great many interesting people to talk to – as you might guess from the fact that it took me three posts to comment on everything I wanted.

So as I mentioned in my previous post, I attended 80% of the conference “Categories, Quanta, Concepts”, hosted by the Perimeter Institute.  Videos of many of the talks are online, but on the assumption that not everyone will watch them all, I’ll comment anyway… ;)

It dealt with various takes on the uses of category theory in fundamental physics, and quantum physics particularly. One basic theme is that the language of categories can organize and clarify the concepts that show up here. Since there doesn’t seem to be a really universal agreement on what “fundamental” physics is, or what the concepts involved might be, this is probably a good thing.

There were a lot of talks, so I’ll split this into a couple of posts – this first one dealing with two obvious category-related themes – monoidal categories and toposes.  The next post will cover most of the others – roughly, focused on fundamentals of quantum mechanics, and on categories for logic and language.

Monoidal Categories

So a large contingent came from Oxford’s Comlab, many of them looking at ideas that I first saw popularized by Abramsky and Coecke about describing the features of quantum mechanics that appear in any dagger-compact category. This yields a “string diagram” notation for quantum systems. (An explanation of this system is given by Abramsky and Coecke – http://arxiv.org/abs/0808.1023 – or more concisely by Coecke – http://arxiv.org/abs/quant-ph/0510032).

Samson Abramsky talked about diagonal arguments. This is a broad class of arguments including Cantor’s theorem (that the real line is uncountable), Russell’s paradox in set theory (about the “set” of non-self-membered sets), Gödel’s incompleteness theorem, and others. Abramsky’s talk was based on Bill Lawvere’s analysis of these arguments in general cartesian closed categories (CCC’s). The relevance to quantum theory has to do with “no-cloning” theorems – that quantum states can’t be duplicated. Diagonal arguments involve two capabilities: the ability to duplicate objects, and the ability to represent predicates (think of Gödel numbering, for instance) which is related to a fixed point property. Generalizing to other monoidal categories, one still has representability: linear functionals on Hilbert spaces can be represented by vectors. But diagonal arguments fail since there is no diagonal \Delta : H \rightarrow H \otimes H.
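
The simplest diagonal argument, Cantor’s theorem for a finite set, fits in a few lines of Python. The particular map `f` from X to subsets of X below is an arbitrary example of mine. Notice where the duplication happens: the definition of the diagonal set uses x twice, once as an argument of f and once as a candidate element.

```python
def diagonal_set(X, f):
    """The set D = {x in X : x not in f(x)}; note that x is used twice."""
    return frozenset(x for x in X if x not in f(x))

X = {0, 1, 2}

# An arbitrary attempt to list subsets of X, indexed by elements of X.
f = {0: frozenset({0, 1}), 1: frozenset(), 2: frozenset({0, 2})}

D = diagonal_set(X, f.__getitem__)
print(D)

# D differs from f(x) at x itself, for every x, so f misses it.
print(all(f[x] != D for x in X))
```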

Bob Coecke and Ross Duncan both spoke about “complementary observables”. Part of this comes from their notion of an “observable structure”, or “classical structure” for a quantum system. The intuition here is that this is some collection of observables which we can simultaneously observe, and such that, if we restrict to those observables, and states which are eigenstates for them, we can treat the whole system as if it were classical. In particular, this gives us “copy” and “destroy” operations for states – these maps and their duals actually turn out to define a Frobenius algebra. In finite-dimensional Hilbert spaces, this is equivalent to choosing an orthonormal basis.

The idea of complementary observables is related to the concept of mutually unbiased bases. The bases \{v_i\} and \{w_j\} are unbiased if all the inner products \langle v_i , w_j \rangle have the same magnitude. If these bases are associated to observables (say, they form a basis of eigenvectors), then knowing a classical value of one observable gives no information about the other – all eigenstates are equally likely. For a visual image, think of two bases for the plane, rotated 45 degrees relative to each other. Each basis vector in one has a projection of equal length onto both basis vectors of the other.
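
The 45-degree picture can be checked numerically. This is a minimal Python sketch with my own function names; the inner product conjugates its first argument, so the same check would work for complex vectors.

```python
import math

def inner(v, w):
    """Inner product <v, w>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def mutually_unbiased(basis_v, basis_w, tol=1e-12):
    """True if all |<v_i, w_j>| have the same magnitude."""
    mags = [abs(inner(v, w)) for v in basis_v for w in basis_w]
    return max(mags) - min(mags) < tol

standard = [(1.0, 0.0), (0.0, 1.0)]
c = math.cos(math.pi / 4)
rotated = [(c, c), (-c, c)]  # the standard basis rotated by 45 degrees

print(mutually_unbiased(standard, rotated))   # equal projections everywhere
print(mutually_unbiased(standard, standard))  # a basis is not unbiased to itself
print(abs(inner(standard[0], rotated[0])))    # the common magnitude
```

For orthonormal bases in dimension d the common magnitude has to be 1/\sqrt{d}, which here is 1/\sqrt{2}.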

Thinking of the orthonormal bases as “observable structures”, the mutually unbiased ones correspond to “complementary” observables: a state which is classical for one observable (i.e. is an eigenstate for that operator) is unbiased (i.e. has equal probabilities of having any value) for the other observable. Labelling the different structures with colours (red and green, usually), they could diagrammatically represent states being classical or unbiased in particular systems.

This is where “phase groups” come into play. The setup is that we’re given some system – the toy model they often referred to was a spinning particle in 3D – and an observable system (say, just containing the observable “spin in the X direction”). Then there’s a group of symmetries of the system which leave that observable untouched (in that example, the symmetries are rotation about the X axis). This is the “phase group” for that observable.

Bill Edwards talked about phase groups and how they can be used to classify systems. He gave an example of a couple of toy models with six states each. One was based on spin (the six states describe spins about each axis in 3-space in each direction). The other, due to Robert Spekkens, is a “hidden variable” theory, where there are four possible “ontic” states (the “hidden” variable), but the six “epistemic” states only register which of the six possible pairs of ontic states the state lies in. The two toy models resemble each other at the level of states, but the phase groups are different: the truly “quantum” one has a cyclic group \mathbb{Z}_4 (for the X-spin observable, it’s generated by a right-angled rotation about the X axis); the “hidden variable” model, which has some quantum-mechanics-like features, but not all, has phase group \mathbb{Z}_2 \times \mathbb{Z}_2. The suggestion of the talk was that this phase group distinguishes “local” from “nonlocal” systems (i.e. ones with hidden variable models and ones without).
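
Both phase groups have four elements, so it is worth seeing why they really are different groups. A quick Python check (my own sketch, not anything from the talk) is to list the order of each element: \mathbb{Z}_4 has elements of order 4, like the quarter-turn, while in \mathbb{Z}_2 \times \mathbb{Z}_2 every non-identity element has order 2.

```python
from itertools import product

def element_orders(elements, op, identity):
    """Sorted list of the orders of all elements of a finite group."""
    orders = []
    for g in elements:
        n, power = 1, g
        while power != identity:
            power, n = op(power, g), n + 1
        orders.append(n)
    return sorted(orders)

z4 = element_orders(range(4), lambda a, b: (a + b) % 4, 0)
klein = element_orders(list(product(range(2), repeat=2)),
                       lambda a, b: ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2),
                       (0, 0))

print(z4)     # includes elements of order 4
print(klein)  # no element of order 4
```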

Marni Sheppard also gave a talk about Mutually Unbiased Bases, p-adic arithmetic, and algebraic geometry over finite fields, which I find hard to summarize because I don’t understand all those fields very well. Roughly, her talk made a link between quantum mechanics and an axiomatic version of projective geometry (Hilbert spaces in QM ought to be projective, after all, so this makes sense).  There was also a connection between mutually unbiased bases and finite fields, but again, this sort of escaped me.

Also in this group was Jamie Vicary, whom I’ve been working with on a project about the categorified harmonic oscillator.  His talk, however, was about n-Hilbert spaces, and n-categorical extended TQFT.  The basic point is that a TQFT assigns a number to a closed n-manifold, and a Hilbert space to each (n-1)-manifold (such as a boundary between two parts of a closed one), and if the TQFT is fully local (i.e. can be derived from, say, a triangulation), this can be continued to have it assign k-Hilbert spaces to (n-k)-manifolds for all k up to n.  He described the structure of 2-Hilbert spaces, and also monoidal ones (as many interesting cases are), and how they can all be realized (in finite dimensions, at least) as categories of representations of supergroupoids.  Part of the point of this talk was to suggest how not just dagger-compact categories, but general n-categories should be useful for quantum theory.

Toposes

The monoidal category setting is popular for dealing with quantum theories, since it abstracts some properties of Hilbert spaces, which they’re usually modelled in. A topos is usually thought of as a generalization of the category of sets, and in particular toposes model intuitionistic logic, which is much closer to classical logic than to quantum logic. So the talk by Andreas Döring (based on work with Christopher Isham – see many of Andreas’ recent papers) called “Why Topos Theory in the Foundations of Physics?” is surprising if you haven’t heard this idea before. One motivation could be described in terms of the Kochen-Specker theorem, which, roughly, says that a quantum theory – involving observables which are operators on a Hilbert space of dimension at least three – can’t be modeled by a “state space”. That is, it’s not the case that you can simultaneously give definite values to all the observables in a consistent way – in ANY state! (That is, it’s not just the generic state: there is no state at all which corresponds to the classical picture of a “point” in some space parametrized by the observables.)

Now, part of the point is that there’s no “state space” in the category of sets – but maybe there is in some other topos!  And sure enough, the equivalent of a state space turns out to be a thing they call the “spectral presheaf” for the theory.  It’s an object in some topos.  The KS theorem becomes a statement that it has no “global points”.  To see what this means, you have to know what the spectral presheaf is.

This is based on the assumption that one has a (noncommutative) von Neumann algebra of operators on a Hilbert space – among them, the observables we might be interested in.  The structure of this algebra is supposed to describe some system.  Now you might want to look for subalgebras of it which are abelian.  Why?  Because a system of commuting operators, should they be observables, are ones which we CAN assign values to simultaneously – there’s no issue of which order we do measurements in.  Call this a “context” – a choice of subalgebra making the system look classical.  So maybe we can describe a “state space” in a context: so what?

Well, the collection of all such contexts forms a poset – in fact, lattice – in fact, a complete Heyting algebra.  These objects are just the same (object-wise) as “locales” (a generalization from topological spaces, and their lattice of open sets).  The topos in question is the category of presheaves on this locale, which is to say, of contravariant functors to Set.  Which is to say… a way of assigning a set (the “state space” I mentioned), with a way of restricting sets along inclusion maps.  This restriction can be a bit rough (in fact, the fact that restriction can be quite approximate is just where uncertainty principles and the like come from).  The main point is that this “spectral presheaf” (the assignment of local state spaces to each context) supports a concept of logic, for reasoning about the system it describes.  It’s a lot like the logic of sets, but operations happen “context-by-context”.  A proposition has a truth value which is a “downset” in the lattice of contexts – the collection of contexts where the proposition is true.  A proposition just amounts to a subobject of the spectral presheaf by what they call “daseinization” – it’s the equivalent of a proposition being a subset of a configuration space (where the statement is true).
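
To see how truth values that are downsets give a logic which isn’t quite classical, here is a tiny Python sketch. Everything in it is a toy of my own: a poset with just two contexts, one contained in the other, with made-up names. The truth values are the downward-closed sets of contexts, and negation is the usual one for downsets: a context belongs to “not S” when none of the contexts below it belong to S.

```python
from itertools import combinations

# Hypothetical toy poset: V_small is a sub-context of V_big.
contexts = ["V_small", "V_big"]
below = {"V_small": {"V_small"}, "V_big": {"V_small", "V_big"}}  # contexts contained in c

def is_downset(S):
    """S is closed under passing to smaller contexts."""
    return all(below[c] <= S for c in S)

def negation(S):
    """Heyting negation: the contexts none of whose sub-contexts lie in S."""
    return {c for c in contexts if not (below[c] & S)}

truth_values = [set(S) for r in range(len(contexts) + 1)
                for S in combinations(contexts, r) if is_downset(set(S))]
print(truth_values)  # three truth values, not two

S = {"V_small"}      # "true in the small context only"
print(negation(S))
print((S | negation(S)) == set(contexts))  # does "S or not S" hold everywhere?
```

In this toy, “S or not S” is not the top truth value, which is the sense in which the logic is intuitionistic rather than classical.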

One could say a lot more, but this is a blog post, after all.

There are philosophical issues that this subject seems to provoke – the sign of an interesting theory is that it gets people arguing, I suppose. One is the characterization of this as a “neo-realist interpretation” of quantum theory. A “naive realist” interpretation would be one that says a “state” is just a way of saying what the values of all the observable quantities are – to put it another way, of giving definite truth values to all definite “yes/no” questions. This is just what the KS theorem says can’t happen. The spectral presheaf is supposedly “neo-realist” because it does almost these things, but in an exotic topos (of presheaves on the locale of all classical contexts). As you might expect, this is a bit of a head-scratcher.

I spent most of last week attending four of the five days of the workshop “Categories, Quanta, Concepts”, at the Perimeter Institute. In the next few days I plan to write up many of the talks, but it was quite a lot. For the moment, I’d like to do a little writeup on the talk I gave. I wasn’t originally expecting to speak, but the organizers wanted the grad students and postdocs who weren’t talking in the scheduled sessions to give little talks. So I gave a short version of this one, which I gave in Ottawa, but as a blackboard talk, so I have no slides for it.

Now, the workshop had about ten people from Oxford’s Comlab visiting, including Samson Abramsky and Bob Coecke, Marni Sheppard, Jamie Vicary, and about half a dozen others. Many folks in this group work in the context of dagger compact categories, which is a nice abstract setting that captures a lot of the features of the category Hilb which are relevant to quantum mechanics. Jamie Vicary had, earlier that day, given a talk about n-dimensional TQFT’s and n-categories – specifically, n-Hilbert spaces. I’ll write up their talks in a later post, but it was a nice context in which to give the talk.

The point of this talk is to describe, briefly, Span(Gpd) – as a category and as a 2-category; to explain why it’s a good conceptual setting for quantum theory; and to show how it bridges the gap between Hilbert spaces and 2-Hilbert spaces.

History and Symmetry

In the course of an afternoon discussion session, we were talking about the various approaches people are taking in fundamentals of quantum theory, and in trying to find a “quantum theory of gravity” (whatever that ends up meaning).  I raised a question about robust ideas: basically, it seems to me that if an idea shows up across many different domains, that’s probably a sign it belongs in a good theory.  I was hoping people knew of a number of these notions, because there are really only two I’ve seen in this light, and really there probably should be more.

The two physical  notions that motivate everything here are (1) symmetry, and (2) emphasis on histories.  Both ideas are applied to states: states have symmetries; histories link starting states to ending states.  Combining them suggests histories should have symmetries of their own, which ought to get along with the symmetries of the states they begin and end with.

Both concepts are rather fundamental. Hermann Weyl wrote a whole book, “Symmetry”, about the first, and wrote: “As far as I can see, all a-priori statements in physics are based on symmetry.” Symmetry shows up everywhere, from diffeomorphism invariance in general relativity, to gauge symmetry in quantum field theory, to the symmetric tensor products involved in Fock space, through classical examples like Noether’s theorem. Noether’s theorem is also about histories: it applies when a symmetry holds along an entire history of a system. In fact, Lagrangian mechanics generally is all about histories, and how they’re selected to be “real” in a classical system (by being critical points of the action functional). The Lagrangian point of view appears in quantum theory (and this was what Richard Feynman did in his thesis) as the famous “sum over histories”, or path integral. General relativity embraces histories as real – they’re spacetimes, which is what GR is all about. So these concepts seem to hold up rather well across different contexts.

I began by drawing this table:

Sets    Span(Sets) \rightarrow Rel
Gpd     Span(Gpd)

The names are all those of categories. Moving left to right moves from a category describing collections of states, to one describing states-and-histories. It so happens that it also takes a cartesian category (or 2-category) to a symmetric monoidal one. Moving from top to bottom goes from a setting with no symmetry to one with symmetry. In both cases, the key concept is naturally expressed with a category, and shows up in morphisms. Now, since groupoids are already categories, both of the bottom entries properly ought to be 2-categories, but when we choose to, we can ignore that fact.

Why Spans?

I’ve written a bunch on spans here before, but to recap, a span in a category C is a diagram like: X \stackrel{s}{\leftarrow} H \stackrel{t}{\rightarrow} Y. Say we’re in Sets, so all these objects are sets: we interpret X and Y as sets of states. Each one describes some system by collecting all its possible (“pure”) states. (To be better, we could start with a different base category – symplectic manifolds, say – and see if the rest of the analysis goes through). For now, we just realize that H is a set of histories leading the system X to the system Y (notice there’s no assumption the system is the same). The maps s,t are source and target maps: they specify the unique state where a history h \in H starts and where it ends.

If C has pullbacks (or at least any we may need), we can use them to compose spans:

X \stackrel{s_1}{\leftarrow} H_1 \stackrel{t_1}{\rightarrow} Y \stackrel{s_2}{\leftarrow} H_2 \stackrel{t_2}{\rightarrow} Z \stackrel{\circ}{\Longrightarrow} X \stackrel{S}{\leftarrow} H_1 \times_Y H_2 \stackrel{T}{\rightarrow} Z

The pullback H_1 \times_Y H_2 – a fibred product if we’re in Sets – picks out pairs of histories in H_1 \times H_2 which match at Y. This should be exactly the possible histories taking X to Z.
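To make the fibred product concrete, here is a minimal sketch in Python, assuming finite sets and representing a span as a dictionary from history labels to (source, target) pairs. The function name `compose_spans` and the toy data are hypothetical, chosen only for illustration.

```python
from itertools import product


def compose_spans(span1, span2):
    """Compose X <- H1 -> Y with Y <- H2 -> Z via the fibred product over Y.

    Each span is a dict {history_label: (source_state, target_state)}.
    The composite's histories are the pairs (h1, h2) with t1(h1) == s2(h2).
    """
    return {
        (h1, h2): (x, z)
        for (h1, (x, y1)), (h2, (y2, z)) in product(span1.items(), span2.items())
        if y1 == y2  # the two histories must match at the middle system Y
    }


# Hypothetical toy data: two histories from X to Y, three from Y to Z.
H1 = {"a": ("x0", "y0"), "b": ("x0", "y1")}
H2 = {"c": ("y0", "z0"), "d": ("y1", "z0"), "e": ("y1", "z1")}

composite = compose_spans(H1, H2)
for pair, (start, end) in sorted(composite.items()):
    print(pair, ":", start, "->", end)
```

Only the pairs that agree at Y survive, so the composite has three histories here rather than all six elements of H_1 \times H_2.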

I’ve included an arrow to the category Rel: this is the category whose objects are sets, and whose morphisms are relations. A number of people at CQC mentioned Rel as an example of a monoidal category which supports toy models having some but not all features of quantum mechanics. It happens to be a quotient of Span(Sets). A relation is an equivalence class of spans, where we only notice whether the set of histories connecting x \in X to y \in Y is empty or not. Span(Sets) is more like quantum mechanics, because its composition is just like matrix multiplication: counting the number of histories from x to y turns the span into a |X| \times |Y| matrix – so we can think of X and Y as being like vector spaces.

In fact, there’s a map L : Span(Sets) \rightarrow Hilb taking an object X to \mathbb{C}^X and a span to the matrix I just mentioned, which faithfully represents Span(Sets). A more conceptual way to say this is: a function f : X \rightarrow \mathbb{C} can be transported across the span. It lifts to H as f \circ s : H \rightarrow \mathbb{C}. Getting down the other leg, we add all the contributions of each history ending at a given y: (t_*(f \circ s))(y) = \sum_{t(h)=y} (f \circ s)(h).

This “sum over histories” is what matrix multiplication actually is.
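As a sanity check on that claim, here is a small Python sketch, assuming finite sets and the same dictionary representation of spans as (source, target) pairs. It counts histories to get a matrix, composes two spans by pullback, and compares the result with the ordinary matrix product. All names (`span_matrix`, `push_pull`, and so on) and the data are hypothetical.

```python
from collections import Counter
from itertools import product


def compose_spans(span1, span2):
    """Pullback composition: keep pairs of histories that match in the middle."""
    return {
        (h1, h2): (x, z)
        for (h1, (x, y1)), (h2, (y2, z)) in product(span1.items(), span2.items())
        if y1 == y2
    }


def span_matrix(span, xs, ys):
    """Entry [i][j] counts the histories from xs[i] to ys[j]."""
    counts = Counter(span.values())
    return [[counts[(x, y)] for y in ys] for x in xs]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def push_pull(span, f, ys):
    """Pull f back along s, then sum over the histories ending at each y."""
    out = {y: 0 for y in ys}
    for x, y in span.values():
        out[y] += f[x]
    return out


X, Y, Z = ["x0", "x1"], ["y0", "y1"], ["z0", "z1"]
H1 = {"h1": ("x0", "y0"), "h2": ("x0", "y1"), "h3": ("x1", "y1")}
H2 = {"k1": ("y0", "z0"), "k2": ("y1", "z0"), "k3": ("y1", "z1")}

M1 = span_matrix(H1, X, Y)
M2 = span_matrix(H2, Y, Z)
M_composite = span_matrix(compose_spans(H1, H2), X, Z)
transported = push_pull(H1, {"x0": 2, "x1": 5}, Y)

print(M_composite == matmul(M1, M2))
print(transported)
```

With rows indexed by X and columns by Y, composing spans corresponds to multiplying the matrices in the same order, and `push_pull` is just a row vector times that matrix.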

Why Groupoids?

The point of groupoids is that they represent sets with a notion of (local) symmetry. A groupoid is a category in which every morphism is invertible. Each such isomorphism tells us that two states are in some sense “the same”. The beginning example is the “action groupoid” that comes from a group G acting on a set X, which we call X /\!\!/ G (or the “weak quotient” of X by G).
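For a finite example, the action groupoid can be written down directly. The following is a rough sketch, assuming a finite group given as a list of elements together with an action function; the names `action_groupoid`, `orbits` and `stabilizer` and the sign-flip example are hypothetical.

```python
def action_groupoid(points, group, act):
    """Morphisms of X//G: one arrow (x, g, g.x) for every point x and element g."""
    return [(x, g, act(g, x)) for x in points for g in group]


def orbits(points, group, act):
    """Isomorphism classes of objects in X//G are the orbits of the action."""
    seen, result = set(), []
    for x in points:
        if x not in seen:
            orbit = frozenset(act(g, x) for g in group)
            seen |= orbit
            result.append(orbit)
    return result


def stabilizer(x, group, act):
    """The automorphisms of x in X//G: group elements fixing x."""
    return [g for g in group if act(g, x) == x]


# Hypothetical example: Z/2 = {0, 1} acting on {-1, 0, 1} by a sign flip.
X = [-1, 0, 1]
G = [0, 1]


def flip(g, x):
    return x if g == 0 else -x


arrows = action_groupoid(X, G, flip)
print(len(arrows))
print(orbits(X, G, flip))
print(stabilizer(0, G, flip), stabilizer(1, G, flip))
```

The states -1 and 1 are identified by an isomorphism, while 0 is alone in its class but carries a nontrivial symmetry group.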

This suggests how groupoids come into the physical picture – the intuition is that X is the set (or, in later variations, space) of states, and G is a group of symmetries.  For example, G could be a group of coordinate transformations: states which can be transformed into each other by a rotation, say, are formally but not physically different.  The Extended TQFT example comes from the case where X is a set of connections, and G the group of gauge transformations.  Of course, not all physically interesting cases come from a single group action: for the harmonic oscillator, the states (“pure states”) are just energy levels – nonnegative integers.  On each state n, there is an action of the permutation group S_n – a “local” symmetry.

One nice thing about groupoids is that one often really only wants to think about them up to equivalence – as a result, it becomes a matter of convention whether formally different but physically indistinguishable states are really considered different.  There’s a side effect, though: Gpd is a 2-category.  In particular, this has two consequences for Span(Gpd): it ought to have 2-morphisms, so we stop thinking about spans up to isomorphism.  Instead, we allow spans of span maps as 2-morphisms.  Also, when composing spans (which are no longer taken up to isomorphism) we have to use a weak pullback, not an ordinary one.  I didn’t have time to say much about the 2-morphism level in the CQC talk, but the slides above do.

In any case, moving into Span(Gpd) means that the arrows in the spans are now functors – in particular, a symmetry of a history h now has to map to a symmetry of the start and end states, s(h) and t(h). In particular, the functors give homomorphisms of the symmetry groups of each object.

Physics in Hilb and 2Hilb

So the point of the above is really to motivate the claim that there’s a clear physical meaning to groupoids (states and symmetries), and spans of them (putting histories on an even footing with states).  There’s less obvious physical meaning to the usual setting of quantum theory, the category Hilb – but it’s a slightly nicer category than Span(Gpd).  For one thing, there is a concept of a “dual” of a span – it’s the same span, with the roles of s and t interchanged.  However (as Jamie Vicary pointed out to me), it’s not an “adjoint” in Span(Gpd) in the technical sense.  In particular, Span(Gpd) is a symmetric monoidal category, like Hilb, but it’s not “dagger compact”, the kind of category all the folks from Oxford like so much.

Now, groupoidification lets us generalize the map L : Span(Sets) \rightarrow Hilb to groupoids making as few changes as possible.  We still use Hilbert space \mathbb{C}^X, but now X is the set of isomorphism classes of objects in the groupoid.  The “sum over histories” – in other words, the linear map associated to a span – is found in almost the same way, but histories now have “weights” found using groupoid cardinality (see any of the papers on groupoidification, or my slides above, for the details).  This reproduces a lot of known physics (see my paper on the harmonic oscillator; TQFT’s can also be defined this way).
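The weights mentioned here come from groupoid cardinality, which (in the standard definition) sums 1/|Aut(x)| over isomorphism classes of objects. Below is a minimal sketch for an action groupoid, assuming a finite group acting on a finite set; the function names and the sign-flip example are hypothetical, and this does not reproduce the full weighting of a span described in the groupoidification papers.

```python
import math
from fractions import Fraction


def groupoid_cardinality(points, group, act):
    """Sum 1/|Aut(x)| over one representative x of each orbit of X//G."""
    total, seen = Fraction(0), set()
    for x in points:
        if x in seen:
            continue
        seen |= {act(g, x) for g in group}  # mark the whole orbit as counted
        automorphisms = sum(1 for g in group if act(g, x) == x)
        total += Fraction(1, automorphisms)
    return total


# Hypothetical example: Z/2 = {0, 1} acting on {-1, 0, 1} by a sign flip.
X = [-1, 0, 1]
G = [0, 1]


def flip(g, x):
    return x if g == 0 else -x


card = groupoid_cardinality(X, G, flip)
print(card)

# Oscillator-style example: level n carries the symmetry group S_n, so the
# partial sums of 1/n! approach e.
partial = float(sum(Fraction(1, math.factorial(n)) for n in range(20)))
print(partial)
```

For an action groupoid of a finite group on a finite set, the result agrees with |X|/|G|, which is one way to see why X /\!\!/ G deserves the name “weak quotient”.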

While this is “as much like” linearization of Span(Set) as possible in some sense, it’s not exactly analogous. It also is rather violent to the structure of the groupoids: at the level of objects it treats X /\!\!/ G as X/G. At the morphism level, it ignores everything about the structure of symmetries in the system except how many of them there are. Since a groupoid is a category, the more direct analogy for \mathbb{C}^X – the set of functions (fancier versions use, say, L^2 functions only) from X to \mathbb{C} – is Hilb^G, the category of functors from a groupoid G into Hilb. That is, representations of G.

One of the attractions here is that, because of a generalization of Tannaka-Krein duality, this category will actually be enough to reconstruct the groupoid if it’s reasonably nice. The representation of Span(Gpd) in 2Hilb, unlike in Hilb, is actually faithful for objects, at least for compact or finite groupoids.

Then you can “pull and push” a representation F across a span to get t_*(F \circ s) – using t_*, the adjoint functor to pulling back. This is the 1-morphism level of the 2-functor I call \Lambda, generalizing the functor L in the world of sets. The result is still a “direct sum over histories” – but because we’re dealing with pushing representations through homomorphisms, this adjoint is a bit more complicated than in the 0-category world of \mathbb{C}. (See my slides or paper for the details). But it remains true that the weights and so forth used in ordinary groupoidification show up here at the level of 2-morphisms. So the representation in 2Hilb is not a faithful representation of the (intuitively meaningful) category Span(Gpd) either. But it does capture a fair bit more than Hilbert spaces.

One point of my talk was to try to motivate the use of 2-Hilbert spaces in physics from an a-priori point of view.  One thing I think is nice, for this purpose, is to see how our physical intuitions motivate Span(Gpd) – a nice point itself – and then observe that there is this “higher level” span around:

Hilb \stackrel{|\cdot |}{\leftarrow} Span(Gpd) \stackrel{\Lambda}{\rightarrow} 2Hilb

Further Thoughts

Where can one take this? There seem to be theories whose states and symmetries naturally want to form n-groupoids: in “higher gauge theory”, a sort of gauge theory for categorical groups, one would have connections as states, gauge transformations as symmetries, and some kind of “symmetry of symmetries”, rather as 2-categories have functors, natural transformations between them, and modifications of these. Perhaps these could be organized into n-dimensional spans-of-spans-of-spans… of n-groupoids. Then representations of an n-groupoid – namely, n-functors into (n-1)-Hilb – could be subjected to the kind of “pull-push” process we’ve just looked at.

Finally, part of the point here was to see how some fundamental physical notions – symmetry and histories – appear across physics, and lead to Span(Gpd).  Presumably these two aren’t enough.  The next principle that looks appealing – because it appears across domains – is some form of an action principle.

But that would be a different talk altogether.

As promised in the previous post, here is a little writeup of the second conference I was at recently…

Connections in Geometry and Physics

The conference at PI was an interestingly varied cross-section of talks, with a good many of them about geometry which, to be honest, is a little over my head. Ostensibly about “connections”, the talks actually ranged quite widely, which was interesting, and reminded me I have a lot of geometry to catch up on. A lot of talks had to do with structures at various places along the hierarchy: (1) symplectic manifolds, (2) Kähler manifolds, and (3) Calabi-Yau manifolds. These last are interesting to string theorists and others, in part because they satisfy a form of Einstein’s equations, while also carrying a bunch of extra structure.

Now, at least I know what all the above things are: Symplectic manifolds (M,\omega) have the “symplectic form” \omega, a non-degenerate closed 2-form (a canonical example being \sum dp^i \wedge dq^i on the cotangent bundle of \mathbb{R}^n, which happens to be the phase space for a particle moving in \mathbb{R}^n – symplectic forms often show up on phase spaces). A Kähler manifold is symplectic, but also has a complex structure (i.e. a way to multiply tangent vectors by i), which preserves the symplectic form, and a metric, which gets along with both of the above. If the metric is Ricci-flat, so that it satisfies the vacuum Einstein equations (this is really where the link to “connections” comes in, since it is a condition on the curvature of the Levi-Civita connection), then M is a Calabi-Yau manifold.

Anyway, this sets up the kind of geometry a lot of people were talking about, and while I didn’t exactly have the background to follow everything, I got a sense of what kinds of questions people are interested in, which was good. A lot of questions have to do with Lagrangian submanifolds of any of the above (from symplectic through Calabi-Yau). These are submanifolds where the symplectic form gives zero when applied to any pair of tangent vectors, and which have the highest possible dimension consistent with this property (namely n, if the original thing is 2n-dimensional). Another theme which came up several times – for example, in the talk by Denis Auroux – has to do with “mirror symmetry” for Kähler manifolds (and Calabi-Yaus), which has to do with finding a “mirror” for the manifold M, called \check{M}, where the complex geometry on the mirror corresponds to the symplectic geometry on M, and vice versa.
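The Lagrangian condition is easy to test at the level of a single tangent space. Here is a minimal sketch, assuming the standard form \sum dp_i \wedge dq_i on \mathbb{R}^{2n} with vectors written as (q_1, …, q_n, p_1, …, p_n). The helper names are hypothetical, and the check assumes the basis vectors supplied are linearly independent rather than verifying it.

```python
def omega(u, v, n):
    """Standard symplectic form sum_i dp_i ^ dq_i on R^{2n}.

    Vectors are tuples (q_1, ..., q_n, p_1, ..., p_n).
    """
    return sum(u[n + i] * v[i] - u[i] * v[n + i] for i in range(n))


def is_isotropic(basis, n):
    """True if omega vanishes on every pair of basis vectors."""
    return all(omega(u, v, n) == 0 for u in basis for v in basis)


def is_lagrangian(basis, n):
    """Isotropic and of dimension n (basis assumed linearly independent)."""
    return len(basis) == n and is_isotropic(basis, n)


n = 2
q_plane = [(1, 0, 0, 0), (0, 1, 0, 0)]   # all momenta zero
mixed = [(1, 0, 0, 0), (0, 0, 1, 0)]     # spanned by a q_1 and a p_1 direction
graph = [(1, 0, 2, 1), (0, 1, 1, 3)]     # p = A q with A symmetric

print(is_lagrangian(q_plane, n))
print(is_lagrangian(mixed, n))
print(is_lagrangian(graph, n))
```

The plane of positions is Lagrangian, the plane mixing q_1 with its own momentum is not, and the graph of a symmetric linear map from positions to momenta is again Lagrangian.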

There were some talks in the direction of physics.  One of the most obviously physical was Niky Kamran’s, talking about a project he’s worked on with F. Finster, J. Smoller, and S-T. Yau, about long-time dynamics of particles satisfying the Dirac equation, living on a background geometry described by the Kerr metric – which describes a rotating black hole.  Since I worked with Niky on a related project for my M.Sc (my thesis was basically a summary putting together a bunch of results by these same four people), I followed this talk better than many of the others.

Working on this project, I got a strong sense of how important symmetry is in studying a lot of real-world problems. One of the essential facts about the Kerr metric is that it’s very symmetric: it’s stable in time, and rotationally symmetric. Actually, all the black-hole solutions to Einstein’s equations are quite symmetric – there is only a small family of solutions, parametrized by mass and angular momentum (and electrical charge). The symmetry makes differential equations written in terms of this metric much nicer – you can split things into the radial and angular parts, for example – and in particular, the wave equations Niky was talking about are integrable just because of this symmetry, so it’s possible to get exact analytic results. (Other approaches to this kind of problem get results only numerically and approximately, but can deal with much more general backgrounds.) The starting point (which basically is what my thesis summarizes) is to show that there are no “bound states” for the Dirac equation. Fermions (which is what it describes) are most familiar to us in bound states: in shells orbiting the nucleus of an atom. But if the attractive force pulling on them is gravity, rather than electrical charge, this situation isn’t stable. The work Niky was talking about deals with what happens instead: what are the long-term dynamics of a fermion near a rotating black hole?

They use spectral methods – basically, Fourier analysis – to find out.  The Dirac equation is a wave equation (for a spinor field), and you can look at the different frequencies, and get an estimate of how fast they decay.  (Since there aren’t stable orbits, the strength of the spinor field has to decay over time.)  In fact, they get a sharp estimate of the order (namely t^{-5/6}).  Basically, one should imagine that the wave is a superposition of “ripples” – some radiating outward from the event horizon, and some converging toward it.  Put in terms of a particle – an electron, say, or a neutrino – this says it will either fall into the black hole, or (if it has enough energy) escape off to infinity.

There were some other physics-ish talks, such as that by James Sparks, on the geometry of the “AdS/CFT” correspondence. This correspondence has to do with two kinds of quantum field theory. The “AdS” stands for “Anti de Sitter”, which is a sort of geometric structure for a manifold which resembles a hyperboloid – actually, all the unit vectors in \mathbb{R}^6 where the metric has signature (4,2): that is, the metric is something like \mathrm{diag}(1,1,1,1,-1,-1). This hyperboloid is 5-dimensional, and has a metric with one timelike dimension. Plain old “de Sitter” space is a similar thing, but using a metric with signature (5,1). It’s possible to define some field theory on AdS space, called supersymmetric supergravity. This theory turns out to have exactly the same algebra of observables as a different theory, “CFT” or conformal field theory, on the (conformal) boundary of Anti de Sitter space. Sparks told us about a geometric interpretation of this.

Then there was Sergei Gukov, with a talk called “Brane Quantization”, based on this work with Ed Witten. He was a little reluctant to describe how this “brane quantization” actually works, preferring to refer us to that paper, but gave us a very nice, and relatively comprehensible, overview of different approaches to quantizing a symplectic manifold. (These tend to show up as phase spaces in classical physics. A basic problem of quantization is how to turn the algebra of functions on a symplectic manifold (M,\omega) into an algebra of operators on a Hilbert space \mathcal{H}.) In particular, he contrasted their method with geometric quantization (which needs to make some arbitrary choices, then takes \mathcal{H} to be a space of sections of some line bundle on M with a connection whose curvature is \omega), and with deformation quantization (which needs no special choices, but only constructs an algebra of operators by algebraic deformation, and not actually \mathcal{H} itself, which some people, but not Sergei Gukov, find satisfactory). The basic idea of brane quantization seems to be that M gets complexified (somehow – it might be either impossible, or non-unique), and then one studies something called an A-model of the result. This is apparently related to, for example, Gromov-Witten theory, which I’ve written about here recently.

Finally, I’ll mention a few other talks which stood out as rather different from the rest.  Veronique Godin talked about “Relative String Topology” – string topology being a way of studying space by looking at embeddings of the circle (or of paths) into it – that is, its loop space (or path space).  Usually, invariants that come from path spaces only detect the homotopy type of the original spaces – in particular, they’re not helpful as knot invariants.  Godin talked about a clever way to detect more structure by means of an A_{\infty}-coalgebra structure on the cohomology groups of the path space.  The “relative” part means one’s looking at a manifold M with embedded submanifold N (for example, N is a knot in M=\mathbb{R}^3), and considering only paths starting and ending on N.  (This is how one can get a coalgebra structure – turning one path into two paths if it crosses through N again is a comultiplication – this extends to chains in the cohomology).

Chris Brav gave a talk about how braid groups act on derived categories, which I didn’t entirely follow, but subsequently he did explain to me in a pretty comprehensible way what people are trying to accomplish when they look at derived categories.  At some point I’ll have to think about this more carefully and maybe post about it.  But roughly, it’s the same sort of “nice categorical properties” I mentioned in the previous post, about smooth spaces.  Looking at derived categories of sheaves on a space, makes the objects seem more complicated, but it also makes them behave better with respect to taking things like limits and colimits.

Benjamin Young prefaced his talk, “Combinatorics Inspired by Donaldson-Thomas Theory”, by pointing out that he’s a combinatorialist, not a geometer. But Donaldson-Thomas invariants are apparently a kind of “signed count” of some geometric structures (as are a lot of invariants – the same kind of “weighted count” invariants appear in Gromov-Witten and Dijkgraaf-Witten theory, just for instance). So he described some geometry relating to “brane tilings” – basically, embedding certain kinds of graphs in a torus – and how they give rise to structures that correspond to certain kinds of Young diagrams (“not the same Young”, he added, perhaps unnecessarily, but it got a chuckle anyway). So the counts can be turned into a combinatorial problem of counting those Young diagrams with the appropriate sign, which can be done using a generating function.

So in any case, this conference had a whole range of talks, from several different fields. While I found myself lost in a number of talks, I was also quite fascinated by how wide a range of topics was embraced under its umbrella – “connections” indeed! So in the end this was one of those conferences which opened my eyes to a wider view of the field, which was certainly a good reason to go!
