### quantum mechanics

So I had a busy week from Feb 7-13, which was when the workshop Higher Gauge Theory, TQFT, and Quantum Gravity (or HGTQGR) was held here in Lisbon.  It ended up being a full day from 0930h to 1900h pretty much every day, except the last.  We’d tried to arrange it so that there were coffee breaks and discussion periods, but there was also a plethora of talks.  Most of the people there seemed to feel that it ended up pretty well.  Since then I’ve been occupied with other things – family visiting the country, for one, so it’s taken a while to get around to writing it up.  Since there were several parts to the event, I’ll do this in several parts as well, of which this is the first one.

Part of the point of the workshop was to bring together a few related subjects in which category theoretic ideas come into areas of mathematics which play a role in physics, and hopefully to build some bridges toward applications.  While it leaned pretty strongly on the mathematical side of this bridge, I think we did manage to get some interaction at the overlap.  Roger Picken drew a nifty picture on the whiteboard at the end of the workshop summarizing how a lot of the themes of the talks clustered around the three areas mentioned in the title, and suggesting how TQFT really does form something of a bridge between the other two – one reason it’s become a topic of some interest recently.  I’ll try to build this up to a similar punchline.

### Pre-School

Before the actual event began, though, we had a bunch of talks at IST for a local audience, to try to explain to mathematicians what the physics part of the workshop was about.  Aleksandr Mikovic gave a two-talk introduction to Quantum Gravity, and Sebastian Guttenberg gave a two-part intro to String Theory.  These are two areas where higher gauge theory (in the form of n-connections and n-bundles, or of n-gerbes) has made an appearance, and were the main physics content of the workshop talks.  They set up the basics to help put those talks in context.

Quantum Gravity

Aleksandr’s first talk set out the basic problem of quantizing the gravitational field (this isn’t the only attitude to what the problem of quantum gravity is, but it’s a good starting point), starting with the basic ingredients.  He summarized how general relativity describes gravity in terms of a metric $g_{\mu \nu}$ which is supposed to satisfy the Einstein equation, relating the curvature of the metric to a source field $T_{\mu \nu}$ which comes from matter.  Quantization then works as follows: starting from a classical picture involving trajectories of particles (or sections of fibre bundles to describe fields), one gets a picture where states are vectors in a Hilbert space, and there’s an algebra of operators including observables (self-adjoint operators) and time-evolution (unitary ones).  An initial try at quantum gravity was to do this using the metric as the field, using the methods of perturbative QFT: treating the metric in terms of “small” fluctuations from some background metric like the flat Minkowski metric.  This uses the Einstein-Hilbert action $S=\frac{1}{G} \int \sqrt{\det(g)}R$, where $G$ is the gravitational constant and $R$ is the Ricci scalar that summarizes the curvature of $g$.  This runs into problems: things diverge in various calculations, and since the coupling constant $G$ has units, one can’t “renormalize” the divergences away.  So one needs a non-perturbative approach, one of which is “canonical quantization”.

After some choice of coordinates (so-called “lapse” and “shift” functions), this involves describing the action in terms of the space part of the metric, $g_{kl}$, and some canonically conjugate “momentum” variables $\pi_{kl}$ which describe its extrinsic curvature.  The Euler-Lagrange equations (found as usual by variational calculus methods) then turn out to give the “Hamiltonian constraint” that certain functions of $g$ are always zero.  Then the program is to get a Poisson algebra giving commutators of the $\pi$ and $g$ variables, then turn it into an algebra of operators in a standard way.  This also runs into problems because the space of metrics isn’t a Hilbert space.  One solution is to not use the metric, but instead a connection and a “frame field” – the so-called Ashtekar variables for GR.  This works better, and gives the “Loop Quantum Gravity” setup, since observables tend to be expressed as holonomies around loops.

Finally, Aleksandr outlined the spin foam approach to quantizing gravity.  This is based on the idea of a quantum geometry as a network (graph) with edges labelled by spins, i.e. representations of SU(2) (which are labelled by half-integers), and vertices labelled by intertwining operators (which imposes triangle inequalities, as it happens).  The spin foam approach takes a Hilbert space with a basis given by these spin networks.  These are supposed to be an alternative way of describing geometries given by SU(2)-connections.  The representations arise because, as the Peter-Weyl theorem shows, they form a nice basis for $L^2(SU(2))$.  The next step is to get operators associated to “foams” that interpolate the spacetime between two such geometries (i.e. linear combinations of spin networks).  These are 2-complexes where faces are labelled with spins, and edges with intertwiners for the spins on the faces incident to them.  The operators arise from a discrete variant of the Feynman path-integral, where time-evolution comes from integrating an action over a space of (classical) trajectories, which in this case are foams.  This needs an action to integrate – in the discrete world, this corresponds to ways of choosing weights $A_e$ for edges and $A_f$ for faces in a generic partition function:

$Z = \sum_{J,I} \prod_{faces} A_f(j_f) \prod_{edges}A_e(i_e)$

which is a sum over the labels for representations and intertwiners.  Some of the talks that came later in the conference (e.g. by Benjamin Bahr and Bianca Dittrich) came back to discuss principles behind how these $A$ functions could be chosen.  (Aristide Baratin’s talk described a similar but more general kind of model based on 2-groups.)
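The triangle inequalities that the intertwiners impose can be made concrete in the simplest case. The sketch below assumes a trivalent vertex (three edges meeting), where a nonzero SU(2) intertwiner exists exactly when the three spins satisfy the triangle inequality and sum to an integer. The function name `is_admissible` is just an illustrative choice, not anything from the talk.

```python
from fractions import Fraction

def is_admissible(j1, j2, j3):
    """Check whether three SU(2) spins can meet at a trivalent vertex.

    Spins are non-negative half-integers (0, 1/2, 1, 3/2, ...).
    Assumption: the vertex is trivalent, so the intertwiner space is
    at most one-dimensional and is nonzero exactly when the spins
    satisfy the triangle inequality and sum to an integer.
    """
    spins = [Fraction(j) for j in (j1, j2, j3)]
    # Each spin must be a non-negative half-integer.
    if any(s < 0 or (2 * s).denominator != 1 for s in spins):
        return False
    a, b, c = spins
    triangle = abs(a - b) <= c <= a + b
    integer_sum = (a + b + c).denominator == 1
    return triangle and integer_sum

half = Fraction(1, 2)
print(is_admissible(half, half, 1))     # two spin-1/2 edges can couple to spin 1
print(is_admissible(half, half, half))  # three spin-1/2 edges cannot meet
print(is_admissible(1, 1, 3))           # violates the triangle inequality
```

In a sum like $Z$ above, labellings that fail this kind of condition at some vertex contribute nothing, since there is no intertwiner to put there.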

String Theory

In parallel with these, Sebastian Guttenberg gave us a two-lecture introduction to string theory.  His starting point is the intuition that a lot of classical physics studies particles living on a background of some field.  The field can be understood as an approximate way of talking about a large number of quantum-mechanical particles, rather as the dynamics of a large number of classical particles can be approximated by the equations of state for a fluid or gas (depending on how much they interact with one another, among other things).  In string theory and “string field theory”, we have a similar setup, except we replace the particles with small strings – either open strings (which look like intervals) or closed ones (which look like circles).

To begin with, he introduced the basic tools of “classical” string theory – the analog of classical mechanics of point particles.  This is the string analog of the following: one can describe a moving particle by its worldline – a path $x : \mathbb{R} \rightarrow M^{(D)}$ from a “generic” worldline into a ($D$-dimensional) manifold $M^{(D)}$.  This $M^{(D)}$ is generally taken to be like physical spacetime, which in this context means that it has a metric $g$ with signature $(-1,1,\dots,1)$ (that is, locally there’s a basis for tangent spaces with one timelike vector and $D-1$ spacelike ones).  Then one can define an action for a moving particle which is just determined by the length of the line’s image.  The nicest way to say this is $S[x] = m \int d\tau \sqrt{x^*g}$, where $x^*g$ means the pullback of the metric along the map $x$, $\tau$ is some parameter along the generic worldline, and $m$, the particle’s mass, is a coupling constant which doesn’t happen to affect the result in this simple case, but eventually becomes important.  One can do the usual variational calculus of the Lagrangian approach here, finding that a critical point of the action occurs when the particle is travelling along a geodesic – a straight line, in flat space, or the closest available approximation.  In particular, the Euler-Lagrange equations say that the covariant derivative of the path’s tangent vector along the path should be zero.
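For comparison with the string case, here is that last condition written out in local coordinates. This is a sketch under one assumption not stated in the talk summary: that $\tau$ is chosen to be an affine parameter (for example, proper time), so that no extra term proportional to the velocity appears.

```latex
\frac{D}{d\tau}\frac{dx^k}{d\tau}
  \;=\; \frac{d^2 x^k}{d\tau^2}
  \;+\; \Gamma^k_{mn}\,\frac{dx^m}{d\tau}\,\frac{dx^n}{d\tau}
  \;=\; 0
```

Here the $\Gamma^k_{mn}$ are the Christoffel symbols of the metric $g$. In flat space with Cartesian coordinates they vanish, and the equation just says the path has zero acceleration.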

There’s an analogous action for a string, the Nambu-Goto action.  Instead of a single-parameter $x$, we now have an embedding of a “generic string worldsheet” – let’s say $\Sigma^{(2)} \cong S^1 \times \mathbb{R}$ – into spacetime: $x : \Sigma^{(2)} \rightarrow M^{(D)}$.  Then the analogous action is just $S[x] = \int_{\Sigma^{(2)}} \star_{x^*g} 1$.  This is pretty much the same as before: we pull back the metric to get $x^*g$, and integrate over the generic worldsheet.  A slight subtlety comes because we’re taking the Hodge dual $\star$.  This is conceptually clean, but expands out to a fairly big integral when you express it in coordinates, where the leading term involves $\sqrt{\det(\partial_{\mu} x^m \partial_{\nu} x^n g_{mn})}$ (the determinant is taken over $(\mu,\nu)$).  Varying this to get the equations of motion produces:

$0 = \partial_{\mu} \partial^{\mu} x^k + \partial_{\mu} x^m \partial^{\mu} x^n \Gamma_{mn}^k$

which is the two-dimensional analog of the geodesic equation for a point particle (the $\Gamma$ are the Christoffel symbols associated to the connection that goes with the metric).  The two-dimensional analog says we have a critical point for the area of the surface which is the image of $\Sigma^{(2)}$ – in fact, a “maximum”, given the sign of the metric.  For solutions like this, the pullback metric on the worldsheet, $x^*g$, looks flat.  (Naturally, the metric looks flat along a geodesic, too, but this is stronger in 2 dimensions, where there can be intrinsic curvature.)

A souped up version of the Nambu-Goto action is the Polyakov action, which is a natural variation that comes up when $\Sigma^{(2)}$ has a metric of its own, $h$.  You can check out the details behind that link, but part of what makes this action nice is that the corresponding Euler-Lagrange equation from varying $h$ says that $x^*g \sim h$.  That is, the worldsheet $\Sigma^{(2)}$ will have an image with a shape such that its own metric agrees with the one induced from the spacetime $M^{(D)}$.  This action is called the Polyakov action (even though it was introduced by Deser and Zumino, among others) because Polyakov used it for quantizing the string.
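For readers who want to see where that proportionality comes from, here is the standard textbook form of the action and of the equation obtained by varying $h$. The overall constant $T$ (the string tension) and the sign convention are assumptions on my part, since the talk summary doesn’t fix them.

```latex
% Polyakov action, with worldsheet metric h and spacetime metric g
S_P[x,h] = -\frac{T}{2} \int_{\Sigma^{(2)}} d^2\sigma \,
           \sqrt{-\det h}\; h^{\mu\nu}\,
           \partial_{\mu} x^m \, \partial_{\nu} x^n \, g_{mn}(x)

% Euler-Lagrange equation from varying h^{\mu\nu}
0 = \partial_{\mu} x^m \, \partial_{\nu} x^n \, g_{mn}
    - \frac{1}{2}\, h_{\mu\nu}\, h^{\alpha\beta}\,
      \partial_{\alpha} x^m \, \partial_{\beta} x^n \, g_{mn}
```

The first term in the second equation is the pullback metric in coordinates, so the equation says it equals $h_{\mu\nu}$ times a scalar function on the worldsheet. That is the sense in which the pulled-back metric and $h$ agree: up to a position-dependent scale.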

Other variations on this action add additional terms which represent fields which the string might be affected by: a scalar $\phi(x)$, and a 2-form field $B_{mn}(x)$ (here we’re using the physics convention where $x$ represents both the function, and its values at particular points, in this case, values of parameters $(\sigma_0,\sigma_1)$ on $\Sigma^{(2)}$).

That 2-form, the “B-field”, is an important field in string theory, and eventually links up with higher gauge theory, which we’ll get to as we go on: one can interpret the B-field as part of a higher connection, to which the string is coupled (as in Baez and Perez, say).  The scalar field $\phi$ essentially determines how strongly the shape of the string itself affects the action – it’s a “string coupling” term, or string coupling “constant” if it’s chosen to be just a number $\phi_0$.  (In such a case, the action includes a term that looks like $\phi_0$ times the Euler characteristic of the surface $\Sigma^{(2)}$.)

Sebastian briefly explained some of the physical intuition for why these are the kinds of couplings which it makes sense to introduce.  Essentially, any coupling one writes in coordinates has to get along with gauge symmetries, changes of coordinates, etc.  That is, there should be no physical difference between the class of solutions one finds in a given set of coordinates, and the coordinates one gets by doing some diffeomorphism on the spacetime $M^{(D)}$, or by changing the metric on $\Sigma^{(2)}$ by some conformal transformation $h_{\mu \nu} \mapsto exp(2 \omega(\sigma^0,\sigma^1)) h_{\mu \nu}$ (that is, scaling by some function of position on the worldsheet – underlying string theory is Conformal Field Theory in that the scale of the generic worldsheet is irrelevant – only the light-cones).  Anything a string couples to should be a field that transforms in a way that respects this.  One important upshot for the quantum theory is that when one quantizes a string coupled to such a field, this makes sure that time evolution is unitary.

How this is done is a bit more complicated than Sebastian wanted to go into in detail (and I got a little lost in the summary) so I won’t attempt to do it justice here.  The end results include a partition function:

$Z = \sum_{topologies} \int dx \, dh \, \frac{\exp(-S[x,h])}{V_{diff} V_{weyl}}$

Remember: if one is finding amplitudes for various observables, the partition function is a normalizing factor, and finding the value of any observables means squeezing them into a similar-looking integral (and normalizing by this factor).  So this says that they’re found by summing over all the string topologies which go from the input to the output, and integrating over all embeddings $x : \Sigma^{(2)} \rightarrow M^{(D)}$ and metrics on $\Sigma^{(2)}$.  (The denominator in that fraction is dividing out by the volumes of the symmetry groups, as usual in quantum field theory, since these symmetries mean one is “overcounting” physically identical situations.)

This is just the beginning of string field theory, of course: just as the dynamics of a free moving particle, or even a particle coupled to a background field, are only the beginning of quantum field theory.  But many later additions can be understood as adding various terms to the action $S$ in some such formalism.  These would be analogs of giving a point-particle attributes like charge, spin, “colour” and so forth in the Standard Model: these define how it couples to, hence is affected by, various kinds of fields.  Such fields can be understood in terms of connections (or, in general, higher connections, as we’ll get to later), which define how structures are “parallel-transported” along a path (or higher-dimensional surface).

Coming up in Part II… I’ll summarize the School portion of the HGTQGR workshop, including lecture series by: Christopher Schommer-Pries on Classifying 2D Extended TQFT, which among other things explained Chris’ proof of the Cobordism Hypothesis using Cerf theory; Tim Porter on Homotopy QFT and the “Crossed Menagerie”, which described a general framework for talking about quantum theories on cobordisms with structure; John Huerta on Higher Gauge Theory, which gave an introductory account of 2-groups and 2-bundles with 2-connections; Christoph Wockel on connections between Higher Gauge Theory and Infinite Dimensional Lie Theory, which described how some infinite-dimensional Lie algebras can’t be integrated to Lie groups, but only to 2-groups; and one by Hisham Sati on Higher Spin Structures in String Theory, which among other things described how cohomological obstructions to putting certain kinds of structure on manifolds motivate the use of particular higher dimensions.

Marco Mackaay recently pointed me at a paper by Mikhail Khovanov, which describes a categorification of the Heisenberg algebra $H$ (or anyway its integral form $H_{\mathbb{Z}}$) in terms of a diagrammatic calculus.  This is very much in the spirit of the Khovanov-Lauda program of categorifying Lie algebras, quantum groups, and the like.  (There’s also another one by Sabin Cautis and Anthony Licata, following up on it, which I fully intend to read but haven’t done so yet. I may post about it later.)

Now, as alluded to in some of the slides from my recent talks, Jamie Vicary and I have been looking at a slightly different way to answer this question, so before I talk about the Khovanov paper, I’ll say a tiny bit about why I was interested.

Groupoidification

The Weyl algebra (or the Heisenberg algebra – the difference being whether the commutation relations that define it give real or imaginary values) is interesting for physics-related reasons, being the algebra of operators associated to the quantum harmonic oscillator.  The particular approach to categorifying it that I’ve worked with goes back to something that I wrote up here, and as far as I know, originally was suggested by Baez and Dolan here.  This categorification is based on “stuff types” (Jim Dolan’s term, based on “structure types”, a.k.a. Joyal’s “species”).  It’s an example of the groupoidification program, the point of which is to categorify parts of linear algebra using the category $Span(Gpd)$.  This has objects which are groupoids, and morphisms which are spans of groupoids: pairs of maps $G_1 \leftarrow X \rightarrow G_2$.  Since I’ve already discussed the background here before (e.g. here and to a lesser extent here), and the papers I just mentioned give plenty more detail (as does “Groupoidification Made Easy”, by Baez, Hoffnung and Walker), I’ll just mention that this is actually more naturally a 2-category (maps between spans are maps $X \rightarrow X'$ making everything commute).  It’s got a monoidal structure, is additive in a fairly natural way, has duals for morphisms (by reversing the orientation of spans), and more.  Jamie Vicary and I are both interested in the quantum harmonic oscillator – he did this paper a while ago describing how to construct one in a general symmetric dagger-monoidal category.  We’ve been interested in how the stuff type picture fits into that framework, and also in trying to examine it in more detail using 2-linearization (which I explain here).

Anyway, stuff types provide a possible categorification of the Weyl/Heisenberg algebra in terms of spans and groupoids.  They aren’t the only way to approach the question, though – Khovanov’s paper gives a different (though, unsurprisingly, related) point of view.  There are some nice aspects to the groupoidification approach: for one thing, it gives a nice set of pictures for the morphisms in its categorified algebra (they look like groupoids whose objects are Feynman diagrams).  Two great features of this Khovanov-Lauda program: the diagrammatic calculus gives a great visual representation of the 2-morphisms; and by dealing with generators and relations directly, it describes, in some sense (see note 1 below), the universal answer to the question “What is a categorification of the algebra with these generators and relations”.  Here’s how it works…

Heisenberg Algebra

One way to represent the Weyl/Heisenberg algebra (the two terms refer to different presentations of isomorphic algebras) uses a polynomial algebra $P_n = \mathbb{C}[x_1,\dots,x_n]$.  In fact, there’s a version of this algebra for each natural number $n$ (the stuff-type references above only treat $n=1$, though extending it to “$n$-sorted stuff types” isn’t particularly hard).  In particular, it’s the algebra of operators on $P_n$ generated by the “raising” operators $a_k(p) = x_k \cdot p$ and the “lowering” operators $b_k(p) = \frac{\partial p}{\partial x_k}$.  The point is that this is characterized by some commutation relations.  For $j \neq k$, we have:

$[a_j,a_k] = [b_j,b_k] = [a_j,b_k] = 0$

but on the other hand

$[a_k,b_k] = 1$

So the algebra could be seen as just a free thing generated by symbols $\{a_j,b_k\}$ with these relations.  These can be understood to be the “raising and lowering” operators for an $n$-dimensional harmonic oscillator.  This isn’t the only presentation of this algebra.  There’s another one, with relations $[p_k,q_k] = i$ (as in $i = \sqrt{-1}$), which has a slightly different interpretation: the $p$ and $q$ operators are the momentum and position operators for the same system.  Finally, a third one – which is the one that Khovanov actually categorifies – is skewed a bit, in that it replaces the $a_j$ with a different set of $\hat{a}_j$ so that the commutation relation actually looks like

$[\hat{a}_j,b_k] = b_{k-1}\hat{a}_{j-1}$

It’s not instantly obvious that this produces the same result – but the $\hat{a}_j$ can be rewritten in terms of the $a_j$, and they generate the same algebra.  (Note that for the one-dimensional version, these are in any case the same, taking $a_0 = b_0 = 1$.)
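The action on polynomials described in this section is easy to experiment with. Here is a minimal sketch, assuming polynomials are stored as dictionaries from exponent tuples to coefficients; the names `raise_op`, `lower_op` and `commutator` are mine, for illustration only. One thing the experiment makes visible: with $a_k$ as multiplication by $x_k$ and $b_k$ as $\partial/\partial x_k$, it is the ordering $b_k a_k - a_k b_k$ that acts as the identity, and the opposite ordering gives $-1$. So the sign in the relation depends on which generator one calls $a$ and which $b$.

```python
def raise_op(k, poly):
    """Multiply a polynomial by x_k.  poly maps exponent tuples to coefficients."""
    out = {}
    for exps, coeff in poly.items():
        new = list(exps)
        new[k] += 1
        out[tuple(new)] = coeff
    return out

def lower_op(k, poly):
    """Differentiate a polynomial with respect to x_k."""
    out = {}
    for exps, coeff in poly.items():
        if exps[k] > 0:
            new = list(exps)
            new[k] -= 1
            out[tuple(new)] = coeff * exps[k]
    return out

def subtract(p, q):
    """Return p - q, dropping monomials whose coefficient is zero."""
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, 0) - coeff
    return {m: c for m, c in out.items() if c != 0}

def commutator(f, g, poly):
    """Apply [f, g] = f g - g f to a polynomial."""
    return subtract(f(g(poly)), g(f(poly)))

# Two variables x_0, x_1; p = 3 x_0^2 x_1 + 5 x_1^3 - 7
p = {(2, 1): 3, (0, 3): 5, (0, 0): -7}

a0 = lambda q: raise_op(0, q)
a1 = lambda q: raise_op(1, q)
b0 = lambda q: lower_op(0, q)
b1 = lambda q: lower_op(1, q)

same_index = commutator(b0, a0, p)       # acts as the identity on p
different_index = commutator(b0, a1, p)  # the zero polynomial, {}
```

The empty dictionary stands for the zero polynomial, so the relations for $j \neq k$ show up as `{}`.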

Diagrammatic Calculus

To categorify this, in Khovanov’s sense (though see note 1 below), means to find a category $\mathcal{H}$ whose isomorphism classes of objects correspond to (integer-) linear combinations of products of the generators of $H$.  Now, in the $Span(Gpd)$ setup, we can say that the groupoid $FinSet_0$, or equivalently $\mathcal{S} = \coprod_n \mathcal{S}_n$, represents Fock space.  Groupoidification turns this into the free vector space on the set of isomorphism classes of objects.  This has some extra structure which we don’t need right now, so it makes the most sense to describe it as $\mathbb{C}[[t]]$, the space of power series (where $t^n$ corresponds to the object $[n]$).  The algebra itself is an algebra of endomorphisms of this space.  It’s this algebra Khovanov is looking at, so the monoidal category in question could really be considered a bicategory with one object, where the monoidal product comes from composition, and the object stands in formally for the space it acts on.  But this space doesn’t enter into the description, so we’ll just think of $\mathcal{H}$ as a monoidal category.  We’ll build it in two steps: the first is to define a category $\mathcal{H}'$.

The objects of $\mathcal{H}'$ are defined by two generators, called $Q_+$ and $Q_-$, and the fact that it’s monoidal (these objects will be the categorifications of $a$ and $b$).  Thus, there are objects $Q_+ \otimes Q_- \otimes Q_+$ and so forth.  In general, if $\epsilon$ is some word on the alphabet $\{+,-\}$, there’s an object $Q_{\epsilon} = Q_{\epsilon_1} \otimes \dots \otimes Q_{\epsilon_m}$.

As in other categorifications in the Khovanov-Lauda vein, we define the morphisms of $\mathcal{H}'$ to be linear combinations of certain planar diagrams, modulo some local relations.  (This type of formalism comes out of knot theory – see e.g. this intro by Louis Kauffman).  In particular, we draw the objects as sequences of dots labelled $+$ or $-$, and connect two such sequences by a bunch of oriented strands (embeddings of the interval, or circle, in the plane).  Each $+$ dot is the endpoint of a strand oriented up, and each $-$ dot is the endpoint of a strand oriented down.  The local relations mean that we can take these diagrams up to isotopy (moving the strands around), as well as various other relations that define changes you can make to a diagram and still represent the same morphism.  These relations include things like:

which seems visually obvious (imagine tugging hard on the ends on the left hand side to straighten the strands), and the less-obvious:

and a bunch of others.  The main ingredients are cups, caps, and crossings, with various orientations.  Other diagrams can be made by pasting these together.  The point, then, is that any morphism is some $\mathbf{k}$-linear combination of these.  (I prefer to assume $\mathbf{k} = \mathbb{C}$ most of the time, since I’m interested in quantum mechanics, but this isn’t strictly necessary.)

The second diagram, by the way, is an important part of categorifying the commutation relations.  This would say that $Q_- \otimes Q_+ \cong Q_+ \otimes Q_- \oplus 1$ (the commutation relation has become a decomposition of a certain tensor product).  The point is that the left hand sides show the composition of two crossings $Q_- \otimes Q_+ \rightarrow Q_+ \otimes Q_-$ and $Q_+ \otimes Q_- \rightarrow Q_- \otimes Q_+$ in two different orders.  One can use this, plus isotopy, to show the decomposition.

That diagrams are invariant under isotopy means, among other things, that the yanking rule holds:

(and similar rules for up-oriented strands, and zig zags on the other side).  These conditions amount to saying that the functors $- \otimes Q_+$ and $- \otimes Q_-$ are two-sided adjoints.  The two cups and caps (with each possible orientation) give the units and counits for the two adjunctions.  So, for instance, in the zig-zag diagram above, there’s a cup which gives a unit map $\mathbf{k} \rightarrow Q_- \otimes Q_+$ (reading upward), all tensored on the right by $Q_-$.  This is followed by a cap giving a counit map $Q_+ \otimes Q_- \rightarrow \mathbf{k}$ (all tensored on the left by $Q_-$).  So the yanking rule essentially just gives one of the identities required for an adjunction.  There are four of them, so in fact there are two adjunctions: one where $Q_+$ is the left adjoint, and one where it’s the right adjoint.

Karoubi Envelope

Now, so far this has explained where a category $\mathcal{H}'$ comes from – the one with the objects $Q_{\epsilon}$ described above.  This isn’t quite enough to get a categorification of $H_{\mathbb{Z}}$: it would be enough to get the version with just one $a$ and one $b$ element, and their powers, but not all the $a_j$ and $b_k$.  To get all the elements of (the integral form of) the Heisenberg algebra, and in particular to get generators that satisfy the right commutation relations, we need to introduce some new objects.  There’s a convenient way to do this, though, which is to take the Karoubi envelope of $\mathcal{H}'$.

The Karoubi envelope of any category $\mathcal{C}$ is a universal way to find a category $Kar(\mathcal{C})$ that contains $\mathcal{C}$ and for which all idempotents split (i.e. have corresponding subobjects).  Think of vector spaces, for example: a map $p \in End(V)$ such that $p^2 = p$ is a projection.  That projection corresponds to a subspace $W \subset V$, and $W$ is actually another object in $Vect$, so that $p$ splits (factors) as $V \rightarrow W \subset V$.  This might not happen in any general $\mathcal{C}$, but it will in $Kar(\mathcal{C})$.  This has, for objects, all the pairs $(C,p)$ where $p : C \rightarrow C$ is idempotent (so $\mathcal{C}$ is contained in $Kar(\mathcal{C})$ as the cases where $p=1$).  The morphisms $f : (C,p) \rightarrow (C',p')$ are just maps $f : C \rightarrow C'$ with the compatibility condition that $p' f = f p = f$ (essentially, maps between the new subobjects).
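To make the vector space example concrete, here is a small sketch with $V = \mathbf{k}^2$ and the idempotent that averages the two coordinates. The names `r` (the map $V \rightarrow W$) and `i` (the inclusion $W \subset V$) are hypothetical labels of my own, and exact fractions stand in for the field $\mathbf{k}$.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[row][k] * b[k][col] for k in range(len(b)))
             for col in range(len(b[0]))]
            for row in range(len(a))]

half = Fraction(1, 2)

# An idempotent on V = k^2: average the two coordinates.
p = [[half, half],
     [half, half]]

# Its splitting through the one-dimensional subspace W spanned by (1, 1):
r = [[half, half]]                  # V -> W
i = [[Fraction(1)], [Fraction(1)]]  # W -> V (the inclusion)

p_is_idempotent = matmul(p, p) == p
p_splits = matmul(i, r) == p and matmul(r, i) == [[Fraction(1)]]

# A morphism (V, p) -> (V, p) in the Karoubi envelope is a map f
# that is unchanged by composing with p on either side.  Here f = 3p.
f = [[3 * entry for entry in row] for row in p]
f_is_morphism = matmul(p, f) == f and matmul(f, p) == f
```

In $Vect$ the pair $(V,p)$ is isomorphic to the honest subspace $W$, which is why nothing new is gained there. In a category like $\mathcal{H}'$, where no such $W$ exists yet, the pair $(C,p)$ itself plays the role of the missing subobject.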

So which new subobjects are the relevant ones?  They’ll be subobjects of tensor powers of our $Q_{\pm}$.  First, consider $Q_{+^n} = Q_+^{\otimes n}$.  Obviously, there’s an action of the symmetric group $\mathcal{S}_n$ on this, so in fact (since we want a $\mathbf{k}$-linear category), its endomorphisms contain a copy of $\mathbf{k}[\mathcal{S}_n]$, the corresponding group algebra.  This has a number of different projections, but the relevant ones here are the symmetrizer:

$e_n = \frac{1}{n!} \sum_{\sigma \in \mathcal{S}_n} \sigma$

which wants to be a “projection onto the symmetric subspace” and the antisymmetrizer:

$e'_n = \frac{1}{n!} \sum_{\sigma \in \mathcal{S}_n} sign(\sigma) \sigma$

which wants to be a “projection onto the antisymmetric subspace” (if it were in a category with the right sub-objects).  The diagrammatic way to depict this is with horizontal bars: so the new object $S^n_+ = (Q_{+^n}, e_n)$ (the symmetrized subobject of $Q_+^{\otimes n}$) is a hollow rectangle, labelled by $n$.  The projection from $Q_+^{\otimes n}$ is drawn with $n$ arrows heading into that box:

The antisymmetrized subobject $\Lambda^n_+ = (Q_{+^n},e'_n)$ is drawn with a black box instead.  There are also $S^n_-$ and $\Lambda^n_-$ defined in the same way (and drawn with downward-pointing arrows).
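That the symmetrizer and antisymmetrizer really are idempotents in $\mathbf{k}[\mathcal{S}_n]$ (so that these pairs are legitimate objects of the Karoubi envelope) can be checked by brute force for small $n$. The sketch below assumes group algebra elements are stored as dictionaries from permutations to rational coefficients; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def multiply(a, b):
    """Product in the group algebra; zero coefficients are dropped."""
    out = {}
    for s, x in a.items():
        for t, y in b.items():
            st = compose(s, t)
            out[st] = out.get(st, 0) + x * y
    return {g: c for g, c in out.items() if c != 0}

def symmetrizer(n):
    """e_n: the average of all permutations in S_n."""
    return {s: Fraction(1, factorial(n)) for s in permutations(range(n))}

def antisymmetrizer(n):
    """e'_n: the signed average of all permutations in S_n."""
    return {s: Fraction(sign(s), factorial(n)) for s in permutations(range(n))}

e = symmetrizer(3)
e_prime = antisymmetrizer(3)

e_is_idempotent = multiply(e, e) == e
e_prime_is_idempotent = multiply(e_prime, e_prime) == e_prime
orthogonal = multiply(e, e_prime) == {}  # the two projections annihilate each other
```

The orthogonality in the last line (valid for $n \geq 2$) is the algebraic reason the symmetric and antisymmetric subobjects are genuinely different pieces of $Q_+^{\otimes n}$.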

The basic fact, which can be shown by various diagram manipulations, is that $S^n_- \otimes \Lambda^m_+ \cong (\Lambda^m_+ \otimes S^n_-) \oplus (\Lambda_+^{m-1} \otimes S^{n-1}_-)$.  The key thing is that there are maps from the left hand side into each of the terms on the right, and the sum can be shown to be an isomorphism using all the previous relations.  The map into the second term involves a cap that uses up one of the strands from each term on the left.

There are other idempotents as well – for every partition $\lambda$ of $n$, there’s a notion of $\lambda$-symmetric things – but ultimately these boil down to symmetrizing the various parts of the partition.  The main point is that we now have objects in $\mathcal{H} = Kar(\mathcal{H}')$ corresponding to all the elements of $H_{\mathbb{Z}}$.  The right choice is that the $\hat{a}_j$  (the new generators in this presentation that came from the lowering operators) correspond to the $S^j_-$ (symmetrized products of “lowering” strands), and the $b_k$ correspond to the $\Lambda^k_+$ (antisymmetrized products of “raising” strands).  We also have isomorphisms (i.e. diagrams that are invertible, using the local moves we’re allowed) for all the relations.  This is a categorification of $H_{\mathbb{Z}}$.

Some Generalities

This diagrammatic calculus is universal enough to be applied to all sorts of settings where there are functors which are two-sided adjoints of one another (by labelling strands with functors, and the regions of the plane with categories they go between).  I like this a lot, since biadjointness of certain functors is essential to the 2-linearization functor $\Lambda$ (see my link above).  In particular, $\Lambda$ uses biadjointness of restriction and induction functors between representation categories of groupoids associated to a groupoid homomorphism (and uses these unit and counit maps to deal with 2-morphisms).  That example comes from the fact that a (finite-dimensional) representation of a finite group(oid) is a functor into $Vect$, and a group(oid) homomorphism is also just a functor $F : H \rightarrow G$.  Given such an $F$, there’s an easy “restriction” $F^* : Fun(G,Vect) \rightarrow Fun(H,Vect)$, that just works by composing with $F$.  Then in principle there might be two different adjoints $Fun(H,Vect) \rightarrow Fun(G,Vect)$, given by the left and right Kan extension along $F$.  But these are defined by colimits and limits, which are the same for (finite-dimensional) vector spaces.  So in fact the adjoint is two-sided.

Khovanov’s paper describes and uses exactly this example of biadjointness in a very nice way, albeit in the classical case where we’re just talking about inclusions of finite groups.  That is, given a subgroup $H < G$, we get a functor $Res_G^H : Rep(G) \rightarrow Rep(H)$, which just considers the obvious action of $H$ on any representation space of $G$.  It has a biadjoint $Ind^G_H : Rep(H) \rightarrow Rep(G)$, which takes a representation $V$ of $H$ to $\mathbf{k}[G] \otimes_{\mathbf{k}[H]} V$, which is a special case of the formula for a Kan extension.  (This formula suggests why it’s also natural to see these as functors between module categories $\mathbf{k}[G]-mod$ and $\mathbf{k}[H]-mod$).  To talk about the Heisenberg algebra in particular, Khovanov considers these functors for all the symmetric group inclusions $\mathcal{S}_n < \mathcal{S}_{n+1}$.
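One can see a shadow of both the adjunction and the commutation relation at the level of characters. The sketch below is my own illustration, not something from the paper: it works over $\mathbb{C}$ with characters rather than functors, takes $\mathcal{S}_1 < \mathcal{S}_2 < \mathcal{S}_3$, and uses the standard formula for an induced character. It checks Frobenius reciprocity (the character-level trace of the adjunction) and the decomposition $Res \circ Ind \cong Id \oplus Ind \circ Res$, which is the representation-theoretic counterpart of $Q_- \otimes Q_+ \cong Q_+ \otimes Q_- \oplus 1$.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def symmetric_subgroup(n, total):
    """S_n inside S_total, as the permutations fixing n, ..., total-1."""
    return [p + tuple(range(n, total)) for p in permutations(range(n))]

def restrict(chi, subgroup):
    """Restrict a class function to a subgroup."""
    return {h: chi[h] for h in subgroup}

def induce(psi, subgroup, group):
    """Induced character: average psi over conjugates that land in the subgroup."""
    members = set(subgroup)
    induced = {}
    for g in group:
        total = Fraction(0)
        for x in group:
            conjugate = compose(inverse(x), compose(g, x))
            if conjugate in members:
                total += psi[conjugate]
        induced[g] = total / len(subgroup)
    return induced

def pairing(a, b, group):
    """Inner product of real-valued class functions."""
    return sum(Fraction(a[g]) * b[g] for g in group) / len(group)

S1, S2, S3 = (symmetric_subgroup(n, 3) for n in (1, 2, 3))

characters_S2 = [{h: 1 for h in S2}, {h: sign(h) for h in S2}]
characters_S3 = [
    {g: 1 for g in S3},                                          # trivial
    {g: sign(g) for g in S3},                                    # sign
    {g: sum(1 for k in range(3) if g[k] == k) - 1 for g in S3},  # standard
]

def frobenius_holds(psi):
    """<Ind psi, chi> over S_3 equals <psi, Res chi> over S_2 for every chi."""
    ind = induce(psi, S2, S3)
    return all(pairing(ind, chi, S3) == pairing(psi, restrict(chi, S2), S2)
               for chi in characters_S3)

def commutation_holds(psi):
    """Res Ind psi equals psi + Ind Res psi, as class functions on S_2."""
    left = restrict(induce(psi, S2, S3), S2)
    right = induce(restrict(psi, S1), S1, S2)
    return all(left[h] == psi[h] + right[h] for h in S2)
```

Both checks pass for the trivial and sign characters of $\mathcal{S}_2$. Of course, equality of characters only sees isomorphism classes; the point of the diagrammatic calculus is to keep track of the actual maps.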

Except for having to break apart the symmetric groupoid as $S = \coprod_n \mathcal{S}_n$, this is all you need to categorify the Heisenberg algebra.  In the $Span(Gpd)$ categorification, we pick out the interesting operators as those generated by the $- \sqcup \{\star\}$ map from $FinSet_0$ to itself, but “really” (i.e. up to equivalence) this is just all the inclusions $\mathcal{S}_n < \mathcal{S}_{n+1}$ taken at once.  However, Khovanov’s approach is nice, because it separates out a lot of what’s going on abstractly and uses a general diagrammatic way to depict all these 2-morphisms (this is explained in the first few pages of Aaron Lauda’s paper on ambidextrous adjoints, too).  The case of restriction and induction is just one example where this calculus applies.

There’s a fair bit more in the paper, but this is probably sufficient to say here.

1 There are two distinct but related senses of “categorification” of an algebra $A$ here, by the way.  To simplify the point, say we’re talking about a ring $R$.  The first sense of a categorification of $R$ is a (monoidal, additive) category $C$ with a “valuation” in $R$ that takes $\otimes$ to $\times$ and $\oplus$ to $+$.  This is described, with plenty of examples, in this paper by Rafael Diaz and Eddy Pariguan.  The other, typical of the Khovanov program, says it is a (monoidal, additive) category $C$ whose Grothendieck ring is $K_0(C) = R$.  Of course, the second definition implies the first, but not conversely.  The elements of the Grothendieck ring come from isomorphism classes of objects in $C$.  A valuation may identify objects which aren’t isomorphic (or, as in groupoidification, morphisms which aren’t 2-isomorphic).

So a categorification of the first sort could be factored into two steps: first take the Grothendieck ring, then take a quotient to further identify things with the same valuation.  If we’re lucky, there’s a commutative square here: we could first take the category $C$, find some surjection $C \rightarrow C'$, and then find that $K_0(C') = R$.  This seems to be the relation between Khovanov’s categorification of $H_{\mathbb{Z}}$ and the one in $Span(Gpd)$. This is the sense in which it seems to be the “universal” answer to the problem.

In the first week of November, I was in Montreal for the biennial meeting of the Philosophy of Science Association, at the invitation of Hans Halvorson and Steve Awodey.  This was for a special session called “Category Theoretical Reflections on the Foundations of Physics”, which also had talks by Bob Coecke (from Oxford), Klaas Landsman (from Radboud University in Nijmegen), and Gonzalo Reyes (from the University of Montreal).  Slides from the talks in this session have been collected here by Steve Awodey.  The meeting was pretty big, and there were a lot of talks on a lot of different topics, some more technical, and some less.  There were enough sessions relating to physics that I had a full schedule just attending those, although for example there were sessions on biology and cognition which I might otherwise have been interested in sitting in on, with titles like “Biology: Evolution, Genomes and Biochemistry”, “Exploring the Complementarity between Economics and Recent Evolutionary Theory”, “Cognitive Sciences and Neuroscience”, and “Methodological Issues in Cognitive Neuroscience”.  And, of course, more fundamental philosophy of science topics like “Fictions and Scientific Realism” and “Kinds: Chemical, Biological and Social”, as well as socially-oriented ones such as “Philosophy of Commercialized Science” and “Improving Peer Review in the Sciences”.  However, interesting as these are, one can’t do everything.

In some ways, this was a really great confluence of interests for me – physics and category theory, as seen through a philosophical lens.  I don’t know exactly how this session came about, but Hans Halvorson is a philosopher of science who started out in physics (and has now, for example, learned enough category theory to teach the course in it offered at Princeton), and Steve Awodey is a philosopher of mathematics who is interested in category theory in its own right.  They managed to get this session brought in to present some of the various ideas about the overlap between category theory and physics to an audience mostly consisting of philosophers, which seems like a good idea.  It was also interesting for me to get a view into how philosophers approach these subjects – what kind of questions they ask, how they argue, and so on.  As with any well-developed subject, there’s a certain amount of jargon and received ideas that people can refer to – for example, I learned the word and current usage (though not the basic concept) of supervenience, which came up, oh, maybe 5-10 times each day.

There are now a reasonable number of people bringing categorical tools to bear on physics – especially quantum physics.  What people who think about the philosophy of science can bring to this research is the usual: careful, clear thinking about the fundamental concepts involved, in a way that tries not to get distracted by the technicalities and keeps the focus on what is important to the question at hand in a deep way.  In this case, the question at hand is physics.  Philosophy doesn’t always accomplish this, of course, and sometimes gets sidetracked by what some might call “pseudoquestions” – the kind of questions that tend to arise when you use some folk-theory or simple intuitive understanding of some subtler concept that is much better expressed in mathematics.  This is why anyone who’s really interested in the philosophy of science needs to learn a lot about science in its own terms.  On the whole, this is what they actually do.

And, of course, both mathematicians and physicists try to do this kind of thinking themselves, but in those fields it’s easy – and important! – to spend a lot of time thinking about some technical question, or doing extensive computations, or working out the fiddly details of a proof, and so forth.  This is the real substance of the work in those fields – but sometimes the bigger “why” questions, that address what it means or how to interpret the results, get glossed over, or answered on the basis of some superficial analogy.  Mind you – one often can’t really assess how a line of research is working out until you’ve been doing the technical stuff for a while.  Then the problem is that people who do such thinking professionally – philosophers – are at a loss to understand the material because it’s recent and technical.  This is maybe why technical proficiency in science has tended to run ahead of real understanding – people still debate what quantum mechanics “means”, even though we can use it competently enough to build computers, nuclear reactors, interferometers, and so forth.

Anyway – as for the substance of the talks…  In our session, since every speaker was a mathematician in some form, they tended to be more technical.  You can check out the slides linked to above for more details, but basically, four views of how to draw on category theory to talk about physics were represented.  I’ve actually discussed each of them in previous posts, but in summary:

• Bob Coecke, on “Quantum Picturalism”, was addressing the monoidal dagger-category point of view, which looks at describing quantum mechanical operations (generally understood to be happening in a category of Hilbert spaces) purely in terms of the structure of that category, which one can see as a language for handling a particular kind of logic.  Monoidal categories, as Peter Selinger has painstakingly documented, can be described using various graphical calculi (essentially, certain categories whose morphisms are variously-decorated “strands”, considered invariant under various kinds of topological moves, are the free monoidal categories with various structures – so anything you can prove using these diagrams is automatically true for any example of such categories).  Selinger has also shown that, for the physically interesting case of dagger-compact closed monoidal categories, a theorem is true in general if and only if it’s true for (finite dimensional) Hilbert spaces, which may account for why Hilbert spaces play such a big role in quantum mechanics.  This program is based on describing as much of quantum mechanics as possible in terms of this kind of diagrammatic language.  This stuff has, in some ways, been explored more through the lens of computer science than physics per se – certainly Selinger is coming from that background.  There’s also more on this connection in the “Rosetta Stone” paper by John Baez and Mike Stay.
• My talk (actually third, but I put it here for logical flow) fits this framework, more or less.  I was in some sense there representing a viewpoint whose current form is due to Baez and Dolan, namely “groupoidification”.  The point is to treat the category $Span(Gpd)$ as a “categorification” of (finite dimensional) Hilbert spaces in the sense that there is a representation map $D : Span(Gpd) \rightarrow Hilb$ so that phenomena living in $Hilb$ can be explained as the image of phenomena in $Span(Gpd)$.  Having done that, there is also a representation of $Span(Gpd)$ into 2-Hilbert spaces, which shows up more detail (much more, at the object level, since Tannaka-Krein reconstruction means that the monoidal 2-Hilbert space of representations of a groupoid is, at least in nice cases, enough to completely reconstruct it).  This gives structures in $2Hilb$ which “conceptually” categorify the structures in $Hilb$, and are also directly connected to specific Hilbert spaces and maps, even though taking equivalence classes in $2Hilb$ definitely doesn’t produce these.  A “state” in a 2-Hilbert space is an irreducible representation, though – so there’s a conceptual difference between what “state” means in categorified and standard settings.  (There’s a bit more discussion in my notes for the talk than in the slides above.)
• Klaas Landsman was talking about what he calls “Bohrification”, which, on the technical side, makes use of Topos theory.  The philosophical point comes from Niels Bohr’s “doctrine of classical concepts” – that one should understand quantum systems using concepts from the classical world.  In practice, this means taking a (noncommutative) von Neumann algebra $A$ which describes the observables of a quantum system and looking at it via its commutative subalgebras.  These are organized into a lattice – in fact, a site.  The idea is that the spectrum of $A$ lives in the topos associated to this site: it’s a presheaf that, over each commutative subalgebra $C \subset A$, just gives the spectrum of $C$.  This is philosophically nice in that the “Bohrified” propositions actually behave in a logically sensible way.  The topos approach comes from Chris Isham, developed further with Andreas Döring. (Note the series of four papers by both from 2007.  Their approach is in some sense dual to that of Landsman, Heunen and Spitters, in the sense that they look at the same site, but look at dual toposes – one of sheaves, the other of cosheaves.  The key bit of jargon in Isham and Döring’s approach is “daseinization”, which is a reference to Heidegger’s “Being and Time”.  For some reason this makes me imagine Bohr and Heidegger in a room, one standing on the ceiling, one on the floor, disputing which is which.)
• Gonzalo Reyes talked about synthetic differential geometry (SDG) as a setting for building general relativity.  SDG is a way of doing differential geometry in a category where infinitesimals are actually available, that is, there is a nontrivial set $D = \{ x \in \mathbb{R} | x^2 = 0 \}$.  This simplifies discussions of vector fields (tangent vectors will just be infinitesimal vectors in spacetime).  A vector field is really a first order DE (and an integral curve tangent to it is a solution), so it’s useful to have, in SDG, the fact that any differentiable curve is, literally, infinitesimally a line.  Then the point is that while the gravitational “field” is a second-order DE, so not a field in this sense, the arguments for GR can be reproduced nicely in SDG by talking about infinitesimally-close families of curves following geodesics.  Gonzalo’s slides are brief by necessity, but happily, more details of this are in his paper on the subject.
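As an aside on the last item: a rough way to get a feel for the statement that a differentiable curve is “literally, infinitesimally a line” is to compute with dual numbers $a + b\epsilon$ where $\epsilon^2 = 0$.  This is only an algebraic stand-in, and an assumption on my part as an illustration: SDG proper works inside a topos with intuitionistic logic, which is what lets $D$ be nontrivial.  Still, the sketch below shows the key identity $f(x + \epsilon) = f(x) + f'(x)\epsilon$.  The class name `Dual` and the test function `f` are hypothetical.

```python
class Dual:
    """A number a + b*eps, where eps*eps = 0 (a nilpotent 'infinitesimal')."""

    def __init__(self, real, eps=0):
        self.real = real
        self.eps = eps

    @staticmethod
    def lift(value):
        """Treat an ordinary number as a dual number with zero eps part."""
        return value if isinstance(value, Dual) else Dual(value)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, since eps^2 = 0
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__


def f(x):
    """An arbitrary polynomial curve, f(x) = x^3 + 2x."""
    return x * x * x + 2 * x


# Evaluate f at 3 + eps: the result is f(3) + f'(3) * eps, an affine function of eps.
value = f(Dual(3, 1))
```

The `real` part of `value` is $f(3)$ and the `eps` part is the slope $f'(3)$, so restricted to the infinitesimal neighbourhood the curve really is a line.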

The other sessions I went to were mostly given by philosophers, rather than physicists or mathematicians, though with exceptions.  I’ll briefly present my own biased and personal highlights of what I attended.  They included sessions titled:

“Quantum Physics”: Edward Slowik talked about the “prehistory of quantum gravity”, basically revisiting the debate between Newton and Leibniz on absolute versus relational space, suggesting that Leibniz’ view of space as a classification of the relation of his “monads” is more in line with relational theories such as spin foams etc.  M. Silberstein and W. Stuckey gave a talk about their “relational blockworld” (described here) which talks about QFT as an approximation to a certain discrete theory, built on a graph, where the nodes of the graph are spacetime events, and using an action functional on the graph.

Meinard Kuhlmann gave an interesting talk about “trope bundles” and AQFT.  Trope ontology is an approach to “entities” that doesn’t assume there’s a split between “substrates” (which have no properties themselves), and “properties” which they carry around.  (A view of ontology that goes back at least to Aristotle’s “substance” and “accident” distinction, and maybe further for all I know).  Instead, this is a “one-category” ontology – the basic things in this ontology are “tropes”, which he defined as “individual property instances” (i.e. as opposed to abstract properties that happen to have instances).  “Things” then, are just collections of tropes.  To talk about the “identity” of a thing means to pick out certain of the tropes as the core ones that define that thing, and others as peripheral.  This struck me initially as a sort of misleading distinction we impose (say, “a sphere” has a core trope of its radial symmetry, and incidental tropes like its colour – but surely the way of picking the object out of the world is human-imposed), until he gave the example from AQFT.  To make a long story short, in this setup, the key entities are something like elementary particles, and the core tropes are those properties that define an irreducible representation of a $C^{\star}$-algebra (things like mass, spin, charge, etc.), whereas the non-core tropes are those that identify a state vector within such a representation: the attributes of the particle that change over time.

I’m not totally convinced by the “trope” part of this (surely there are lots of choices of the properties which determine a representation, but I don’t see the need to give those properties the burden of being the only ontological primaries), but I also happen to like the conclusions because in the 2-Hilbert picture, irreducible representations are states in a 2-Hilbert space, which are best thought of as morphisms, and the state vectors in their components are best thought of in terms of 2-morphisms.  An interpretation of that setup says that the 1-morphism states define which system one’s talking about, and the 2-morphism states describe what it’s doing.

“New Directions Concerning Quantum Indistinguishability”: I only caught a couple of the talks in this session, notably missing Nick Huggett’s “Expanding the Horizons of Quantum Statistical Mechanics”.  There were talks by John Earman (“The Concept of Indistinguishable Particles in Quantum Mechanics”), and by Adam Caulton (based on work with Jeremy Butterfield) on “On the Physical Content of the Indistinguishability Postulate”.  These are all about the idea of indistinguishable particles, and the statistics thereof.  Conventionally, in QM you only talk about bosons and fermions – one way to say what this means is that the permutation group $S_n$ naturally acts on a system of $n$ particles, and it acts either trivially (not altering the state vector at all), or by sign (each swap of two particles multiplies the state vector by a minus sign).  This amounts to saying that only one-dimensional representations of $S_n$ occur.  It is usually justified by the “spin-statistics theorem”, relating it to the fact that particles have either integer or half-integer spins (classifying representations of the rotation group).  But there are other representations of $S_n$, labelled by Young diagrams, though they are more than one-dimensional.  This gives rise to “paraparticle” statistics.  On the other hand, permuting particles in two dimensions is not homotopically trivial, so one ought to use the braid group $B_n$, rather than $S_n$, and this gives rise again to different statistics, called “anyonic” statistics.
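For concreteness, here is a small sketch (my own, with hypothetical helper names) of the two one-dimensional representations of $S_n$ mentioned above, for $n = 4$.  It computes the sign of a permutation from its inversion count and checks by brute force that both the trivial map and the sign map respect composition, which is all it takes to be a one-dimensional representation.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'q first, then p' (tuples of images)."""
    return tuple(p[q[i]] for i in range(len(q)))


def sign(p):
    """Sign of a permutation: +1 for an even number of inversions, -1 for odd."""
    inversions = sum(1
                     for i in range(len(p))
                     for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1


def trivial(p):
    """The trivial representation: every permutation acts as +1 (bosons)."""
    return 1


def is_homomorphism(rep, group):
    """Check rep(p q) == rep(p) rep(q) for all pairs in the group."""
    return all(rep(compose(p, q)) == rep(p) * rep(q)
               for p in group for q in group)


S4 = list(permutations(range(4)))
swap = (1, 0, 2, 3)   # exchange the first two particles

bosonic_ok = is_homomorphism(trivial, S4)
fermionic_ok = is_homomorphism(sign, S4)
```

A single swap has `sign(swap) == -1`, which is the minus sign a fermionic state vector picks up.  The higher-dimensional representations behind paraparticle statistics are not covered by this sketch.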

One recurring idea is that, to deal with paraparticle statistics, one needs to change the formalism of QM a bit, and expand the idea of a “state vector” (or rather, ray) to a “generalized ray” which has more dimensions – corresponding to the dimension of the representation of $S_n$ one wants the particles to have.  Anyons can be dealt with a little more conventionally, since a 2D system may already have them.  Adam Caulton’s talk described how this can be seen as a topological phenomenon or a dynamical one – making an analogy with the Bohm-Aharonov effect, where the holonomy of an EM field around a solenoid can be described either dynamically with an interacting Lagrangian on flat space, or topologically with a free Lagrangian in space where the solenoid has been removed.

“Quantum Mechanics”: A talk by Elias Okon and Craig Callender about QM and the Equivalence Principle, based on this.  There has been some discussion recently as to whether quantum mechanics is compatible with the principle that relates gravitational and inertial mass.  They point out that there are several versions of this principle, and that although QM is incompatible with some versions, these aren’t the versions that actually produce general relativity.  (For example, objects with large and small masses fall differently in quantum physics, because though the mean travel time is the same, the variance is different.  But this is not a problem for GR, which only demands that all matter responds dynamically to the same metric.)  Also, talks by Peter Lewis on problems with the so-called “transactional interpretation” of QM, and Bryan Roberts on time-reversal.

“Why I Care About What I Don’t Yet Know”:  A funny name for a session about time-asymmetry, which is the essentially philosophical problem of why, if the laws of physics are time-symmetric (which they approximately are for most purposes), what we actually experience isn’t.  Personally, the best philosophical account of this I’ve read is Huw Price’s “Time’s Arrow”, though Reichenbach’s “The Direction of Time” has good stuff in it also, and there’s also Zeh’s more technical “The Physical Basis of the Direction of Time”. In the session, Chris Suhler and Craig Callender gave an account of how, given causal asymmetry, our subjective asymmetry of values for the future and the past can arise (the intuitively obvious point being that if we can influence the future and not the past, we tend to value it more).  Mathias Frisch talked about radiation asymmetry (the fact that it’s just as possible in EM to have waves converging on a source as spreading out from it, yet we don’t see this).  Owen Maroney argued that “There’s No Route from Thermodynamics to the Information Asymmetry” by describing in principle how to construct a time-reversed (probabilistic) computer.  David Wallace spoke on “The Logic of the Past Hypothesis”, the idea inspired by Boltzmann that we see time-asymmetry because there is a point in what we call the “past” where entropy was very low, and so we perceive the direction away from that state as “forward” in time because the world tends to move toward equilibrium (though he pointed out that for dynamical reasons, the world can easily stay far away from equilibrium for a long time).  He went on to discuss the logic of this argument, and the idea of a “simple” (i.e. easy-to-describe) distribution, and the conjecture that the evolution of these will generally be describable in terms of an evolution that uses “coarse graining” (i.e. that repeatedly throws away microscopic information).

“The Emergence of Spacetime in Quantum Theories of Gravity”:  This session addressed the idea that spacetime (or in some cases, just space) might not be fundamental, but could emerge from a more basic theory.  Christian Wüthrich spoke about “A-Priori versus A-Posteriori” versions of this idea, mostly focusing on ideas such as LQG and causal sets, which start with discrete structures, and get manifolds as approximations to them.  Nick Huggett gave an overview of noncommutative geometry for the philosophically minded audience, explaining how an algebra of observables can be treated like space by means of all the concepts from geometry which can be imported into the theory of $C^{\star}$-algebras, where space would be an approximate description of the algebra by letting the noncommutativity drop out of sight in some limit (which would be described as a “large scale” limit).  Sean Carroll discussed the possibility that “Space is Not Fundamental – But Time Might Be”, pointing out that even in classical mechanics, space is not a fundamental notion (since it’s possible to reformulate even Hamiltonian classical mechanics without making essential distinctions between position and momentum coordinates), and suggesting that space arises from the dynamics of an actual physical system – a Hamiltonian, in this example – by the principle “Position Is The Thing In Which Interactions Are Local”.  Finally, Tim Maudlin gave an argument for the fundamentality of time by showing how to reconstruct topology in space from a “linear structure” on points saying what a (directed!) path among the points is.

I say this is about a “recent” talk, though of course it was last year… But to catch up: Ivan Dynov was visiting from York and gave a series of talks, mainly to the noncommutative geometry group here at UWO, about the problem of classifying von Neumann algebras. (Strictly speaking, since there is not yet a complete set of invariants for von Neumann algebras known, one could dispute the following is a “classification”, but here it is anyway).

The first point is that any von Neumann algebra $\mathcal{M}$ is a direct integral of factors, which are highly noncommutative in that the centre of a factor consists of just the multiples of the identity. The factors are the irreducible building blocks of the noncommutative features of $\mathcal{M}$.

There are two basic tools that provide what classification we have for von Neumann algebras: first, the order theory for projections; second, the Tomita-Takesaki theory. I’ve mentioned the Tomita flow previously, but as for the first part:

A projection (self-adjoint idempotent) is just what it sounds like, if you represent $\mathcal{M}$ as an algebra of bounded operators on a Hilbert space. An extremal but informative case is $\mathcal{M} = \mathcal{B}(H)$, but in general not every bounded operator appears in $\mathcal{M}$.

In the case where $\mathcal{M} = \mathcal{B}(H)$, a projection in $\mathcal{M}$ is the same thing as a subspace of $H$. There is an (orthomodular) lattice of them (in general, the lattice of projections is $\mathcal{P(M)}$). For subspaces, the dimension characterizes a subspace up to isomorphism – any two subspaces of the same dimension are isomorphic by some operator in $\mathcal{B}(H)$ (but not necessarily in a general $\mathcal{M}$).

The idea is to generalize this to projections in a general $\mathcal{M}$, and get some characterization of $\mathcal{M}$. The kind of isomorphism that matters for subspaces is a partial isometry – a map $u$ which preserves the metric on some subspace, and otherwise acts as a projection. In fact, the corresponding projections are then conjugate by $u$. So we define, for a general $\mathcal{M}$, an equivalence relation on projections, which amounts to saying that $e \sim f$ if there’s a partial isometry $u \in \mathcal{M}$ with $e = u^*u$, and $f = uu^*$ (i.e. the projections are conjugate by $u$).
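Here is a toy version of this in the easy case $\mathcal{M} = \mathcal{B}(H)$ with $H = \mathbb{C}^3$, just to fix the conventions.  It’s a sketch with hypothetical helper names, using plain nested lists for matrices: $e$ and $f$ project onto the first and second coordinate axes, and $u$ is the partial isometry carrying the first axis to the second and killing everything else.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def adjoint(a):
    """Conjugate transpose of a matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def is_projection(e):
    """A projection is a self-adjoint idempotent: e == e* and e*e == e."""
    return adjoint(e) == e and matmul(e, e) == e


# Projections onto the first and second coordinate axes of C^3.
e = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
f = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

# Partial isometry sending the first basis vector to the second.
u = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]

initial_projection = matmul(adjoint(u), u)   # u* u
final_projection = matmul(u, adjoint(u))     # u u*
```

In this example $u^*u = e$ and $uu^* = f$, so $e \sim f$, matching the fact that the two subspaces have the same dimension.  In a general $\mathcal{M}$ the point is that such a $u$ has to lie in $\mathcal{M}$, which this finite-dimensional sketch can’t capture.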

Then there’s an order relation on the equivalence classes of projections – which, as suggested above, we should think of as generalizing “dimension” from the case $\mathcal{M} = \mathcal{B}(H)$. The order relation says that $e \leq f$ if $e \sim e_0$ where $e_0 \leq f$ as a projection (i.e. inclusion thinking of a projection as its image subspace of $H$). But the fact that $\mathcal{M}$ may not be all of $\mathcal{B}(H)$ has some counterintuitive consequences. For example, we can define a projection $e \in \mathcal{M}$ to be finite if the only time $e \sim e_0 \leq e$ is when $e_0 = e$ (which is just the usual definition of finite, relativized to use only maps in $\mathcal{M}$). We can call $e \in \mathcal{M}$ a minimal projection if it is nonzero and $f \leq e$ implies $f = e$ or $f = 0$.

Then the first pass at a classification of factors (i.e. “irreducible” von Neumann algebras) says a factor $\mathcal{M}$ is:

• Type $I$: If $\mathcal{M}$ contains a minimal projection
• Type $II$: If $\mathcal{M}$ contains no minimal projection, but contains a (nontrivial) finite projection
• Type $III$: If $\mathcal{M}$ contains no minimal or nontrivial finite projection

We can further subdivide them by following the “dimension-function” analogy, which captures the ordering of projections for $\mathcal{M} = \mathcal{B}(H)$, since it’s a theorem that there will be a function $d : \mathcal{P(M)} \rightarrow [0,\infty]$ which has the properties of “dimension” in that it gets along with the equivalence relation $\sim$, respects finiteness, and “dimension” of direct sums. Then letting $D$ be the range of this function, we have a few types. There may be more than one function $d$, but every case has one of the types:

• Type $I_n$: When $D = \{0,1,\dots,n\}$ (That is, there is a maximal, finite projection)
• Type $I_\infty$: When $D = \{ 0, 1, \dots, \infty \}$ (If there is an infinite projection in $\mathcal{M}$)
• Type $II_1$: When $D = [ 0 , 1 ]$ (The maximal projection is finite – such a case can always be rescaled so the maximum $d$ is $1$)
• Type $II_\infty$: When $D = [ 0 , \infty ]$ (The maximal projection is infinite – notice that this has the same order type as type $II_1$)
• Type $III_\infty$: When $D = [0,\infty]$ (An infinite maximal projection)
• Type $III$: When $D = \{0,\infty\}$ (every nonzero projection is infinite; these are called properly infinite)

The type $I$ cases are all just (equivalent to) matrix algebras on some countable or finite dimensional vector space – which we can think of as a function space like $l_2(X)$ for some set $X$. Types $II$ and $III$ are more interesting. Type $II$ algebras are related to what von Neumann called “continuous geometries” – analogs of projective geometry (i.e. geometry of subspaces), with a continuous dimension function.

(If we think of these algebras $\mathcal{M}$ as represented on a Hilbert space $H$, then in fact, thought of as subspaces of $H$, all the projections give infinite dimensional subspaces. But the definition of “finite” is relative to $\mathcal{M}$, and any partial isometry from a subspace $H' \leq H$ to a proper subspace $H'' < H'$ of itself that may exist in $\mathcal{B}(H)$ is not in $\mathcal{M}$, so such a projection can still count as finite.)

In any case, this doesn’t exhaust what we know about factors. In his presentation, Ivan Dynov described some examples constructed from crossed products of algebras, which is important later, but for the moment, I’ll finish describing another invariant which helps pick apart the type $III$ factors. This is related to Tomita-Takesaki theory, which I’ve mentioned in here before.

You’ll recall that the Tomita flow (associated to a given state $\phi$) is given by $\sigma^{\phi}_t(A) = e^{i \Delta t} A e^{-i \Delta t}$, where $\Delta$ is the self-adjoint part of the conjugation operator $S$ (which depends on the state $\phi$ because it refers to the GNS representation of $\mathcal{M}$ on a Hilbert space $H$). This flow is uninteresting for Type $I$ or $II$ factors, but for type $III$ factors, it’s the basis of Connes’ classification.

In particular, we can understand the Tomita flow in terms of eigenvalues of $\Delta$, since it comes from exponentials of $\Delta$. Moreover, as I commented last time, the really interesting part of the flow is independent of which state we pick. So we are interested in the common eigenvalues of the $\Delta$ associated to different states $\phi$, and define

$S(\mathcal{M}) = \cap_{\phi \in W} Spec(\Delta_{\phi})$

(where $W$ is the set of all states on $\mathcal{M}$, or actually “weights”)

Then $S(\mathcal{M}) - \{ 0 \}$, it turns out, is always a multiplicative subgroup of the positive real line, and the possible cases refine to these:

• $S(\mathcal{M}) = \{ 1 \}$ : This is when $\mathcal{M}$ is type $I$ or $II$
• $S(\mathcal{M}) = [0, \infty )$ : Type $III_1$
• $S(\mathcal{M}) = \{ 0 \} \cup \{ \lambda^n : n \in \mathbb{Z}, 0 < \lambda < 1 \}$ : Type $III_{\lambda}$ (for each $\lambda$ in the range $(0,1)$), and
• $S(\mathcal{M}) = \{ 0 , 1 \}$ : Type $III_0$

(Taking logarithms, $S(\mathcal{M}) - \{ 0 \}$ gives an additive subgroup of $\mathbb{R}$, $\Gamma(\mathcal{M})$ which gives the same information). So roughly, the three types are: $I$ finite and countable matrix algebras, where the dimension function tells everything; $II$ where the dimension function behaves surprisingly (thought of as analogous to projective geometry); and $III$, where dimensions become infinite but a “time flow” dimension comes into play.  The spectra of $\Delta$ above tell us about how observables change in time by the Tomita flow:  high eigenvalues cause the observable’s value to change faster with time, while low ones change slower.  Thus the spectra describe the possible arrangements of these eigenvalues: apart from the two finite cases, the types are thus a continuous positive spectrum, and a discrete one with a single generator.  (I think of free and bound energy spectra, for an analogy – I’m not familiar enough with this stuff to be sure it’s the right one).
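The passage from the multiplicative picture to the additive one is easy to see in the type $III_{\lambda}$ case.  The following is just an arithmetic sketch with an arbitrarily chosen $\lambda = 1/2$ (my assumption, purely for illustration): the nonzero part of $S(\mathcal{M})$ is $\{ \lambda^n \}$, which is closed under multiplication, and its logarithms are the integer multiples of $\log \lambda$.

```python
import math
from fractions import Fraction

lam = Fraction(1, 2)   # hypothetical choice of lambda in (0, 1)

# A finite window of the multiplicative group {lambda^n : n in Z}.
powers = {n: lam ** n for n in range(-4, 5)}

# Multiplicative closure inside the window: lambda^a * lambda^b = lambda^(a+b).
closed = all(powers[a] * powers[b] == powers[a + b]
             for a in range(-2, 3) for b in range(-2, 3))

# Taking logarithms gives the additive group Z * log(lambda).
log_lam = math.log(float(lam))
logs = {n: math.log(float(x)) for n, x in powers.items()}
additive = all(math.isclose(logs[n], n * log_lam, abs_tol=1e-12)
               for n in logs)
```

So $\Gamma(\mathcal{M})$ here is the discrete group generated by $\log \lambda$, whereas for type $III_1$ it would be all of $\mathbb{R}$.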

This role for time flow is interesting because of the procedures for constructing examples of type $III$, which Ivan Dynov also described to us. These are examples associated with dynamical systems. These show up as crossed products. See the link for details, but roughly this is a “product” of an algebra by a group action – a kind of von Neumann algebra equivalent of the semidirect product of groups $H \rtimes K$ incorporating an action of $K$ on $H$. Indeed, if a (locally compact) group $K$ acts on a group $H$ then the crossed product of algebras is just the von Neumann algebra of the semidirect product group.

In general, a $W^*$-dynamical system is $(\mathcal{M},G,\alpha)$, where $G$ is a locally compact group acting by automorphisms on the von Neumann algebra $\mathcal{M}$, by the map $\alpha : G \rightarrow Aut(\mathcal{M})$. Then the crossed product $\mathcal{M} \rtimes_{\alpha} G$ is the algebra for the dynamical system.

A significant part of the talks (which I won’t cover here in detail) described how to use some examples of these to construct particular type $III$ factors. In particular, a theorem of Murray and von Neumann says $\mathcal{M} = L^{\infty}(X,\mu) \rtimes_{\alpha} G$ is a factor if the action of a discrete group $G$ on a finite measure space $X$ is ergodic (i.e. has no nontrivial proper invariant sets – roughly, each orbit is dense). Another says this factor is type $III$ unless there’s a measure equivalent to (i.e. absolutely continuous with) $\mu$, and which is equivariant. Some clever examples I won’t reconstruct gave some factors like this explicitly.

He concluded by talking about some efforts to improve the classification: the above is not a complete set of invariants, so a lot of work in this area is improving the completeness of the set. One set of results he told us about does this somewhat for the case of hyperfinite factors (i.e. ones which are limits of finite ones), namely that if they are type $III$, they are crossed products with a discrete group.

At any rate, these constructions are interesting, but it would take more time than I have here to look in detail – perhaps another time.

When I made my previous two posts about ideas of “state”, one thing I was aiming at was to say something about the relationships between states and dynamics. The point here is that, although the idea of “state” is that it is intrinsically something like a snapshot capturing how things are at one instant in “time” (whatever that is), extrinsically, there’s more to the story. The “kinematics” of a physical theory consists of its collection of possible states. The “dynamics” consists of the regularities in how states change with time. Part of the point here is that these aren’t totally separate.

Just for one thing, in classical mechanics, the “state” includes time-derivatives of the quantities you know, and the dynamical laws tell you something about the second derivatives. This is true in both the Hamiltonian and Lagrangian formalism of dynamics. The Hamiltonian formalism is based on a function $H(q,p)$, which represents the concept of “energy” in the context of a system.  Here $q$ is a vector representing the values of some collection of variables describing the system (generalized position variables, in some configuration space $X$), and the $p = m \dot{q}$ are corresponding “momentum” variables, which are the other coordinates in a phase space which in simple cases is just the cotangent bundle $T^*X$. Here, $m$ refers to mass, or some equivalent. The familiar case of a moving point particle has “energy = kinetic + potential”, or $H = p^2 / 2m + V(q)$ for some potential function $V$. The symplectic form on $T^*X$, together with $H$, can then be used to define a path through any point, which describes the evolution of the system in time – notably, it conserves the energy $H$. Then there’s the Lagrangian, which defines the “action” associated to a path, which comes from integrating some function $L(q, \dot{q})$ living on the tangent bundle $TX$, over the path. The physically realized paths (classically) are critical points of the action, with respect to variations of the path.
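To see the Hamiltonian side of this in action, here is a minimal numerical sketch for a one-dimensional harmonic oscillator.  The choices here are my own assumptions for illustration: the standard kinetic term $p^2/2m$, the potential $V(q) = k q^2 / 2$, units with $m = k = 1$, and a leapfrog (symplectic) integrator for Hamilton’s equations $\dot{q} = \partial H / \partial p$, $\dot{p} = - \partial H / \partial q$.

```python
import math


def hamiltonian(q, p, m=1.0, k=1.0):
    """Energy = kinetic + potential for a harmonic oscillator."""
    return p * p / (2 * m) + 0.5 * k * q * q


def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate Hamilton's equations with the leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * k * q   # half step in p: dp/dt = -dH/dq = -k q
        q += dt * p / m         # full step in q: dq/dt = dH/dp = p / m
        p -= 0.5 * dt * k * q   # second half step in p
    return q, p


q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=0.01, steps=1000)   # evolve to time t = 10

energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
position_error = abs(q1 - math.cos(10.0))        # exact solution is q(t) = cos(t)
```

The state $(q, p)$ moves around the phase space while `energy_drift` stays tiny, which is the numerical version of the statement that the flow conserves $H$.  The integrator itself is only approximate, so this is conservation up to discretization error.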

This is all based on the view of a “state” as an element of a set (which happens to be a symplectic manifold like $T^*X$ or just a manifold if it’s $TX$), and both the “energy” and the “action” are some kind of function on this set. A little extra structure (symplectic form, or measure on path space) turns these functions into a notion of dynamics. Now a function on the space of states is what an observable is: energy certainly is easy to envision this way, and action (though harder to define intuitively) counts as well.

But another view of states which I mentioned in that first post is the one that pertains to statistical mechanics, in which a state is actually a statistical distribution on the set of “pure” states. This is rather like a function – it’s slightly more general, since a distribution can have point-masses, but any function gives a distribution if there’s a fixed measure $d\mu$ around to integrate against – then a function like $H$ becomes the measure $H d\mu$. And this is where the notion of a Gibbs state comes from, though it’s slightly trickier. The idea is that the Gibbs state (in some circumstances called the Boltzmann distribution) is the state a system will end up in if it’s allowed to “thermalize” – it’s the maximum-entropy distribution for a given amount of energy in the specified system, at a given temperature $T$. So, for instance, for a gas in a box, this describes how, at a given temperature, the kinetic energies of the particles are (probably) distributed. Up to a bunch of constants of proportionality, one expects that the weight given to a state (or region in state space) is just $\exp(-H/T)$, where $H$ is the Hamiltonian (energy) for that state. That is, the likelihood of being in a state is inversely proportional to the exponential of its energy – and higher temperature makes higher energy states more likely.
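As a small illustration of the weights $\exp(-H/T)$, here is a sketch for a system with finitely many pure states. The three energy levels and the two temperatures are made up for the example, and units are chosen so that Boltzmann's constant is 1.

```python
import math

def gibbs_distribution(energies, T):
    """Normalized weights exp(-E/T) over a finite set of pure states."""
    weights = [math.exp(-E / T) for E in energies]
    Z = sum(weights)                 # normalizing constant (partition function)
    return [w / Z for w in weights]

energies = [0.0, 1.0, 2.0]           # hypothetical energy levels
cold = gibbs_distribution(energies, T=0.5)
hot = gibbs_distribution(energies, T=5.0)
```

At either temperature the lower-energy states are more likely, but the highest level gets noticeably more weight in `hot` than in `cold`.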

Now part of the point here is that, if you know the Gibbs state at temperature $T$, you can work out the Hamiltonian just by taking a logarithm (up to an additive constant, which doesn’t affect the dynamics) – so specifying a Hamiltonian and specifying the corresponding Gibbs state are completely equivalent. But specifying a Hamiltonian (given some other structure) completely determines the dynamics of the system.
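Here is a quick sketch of that logarithm step for a finite system, with invented energy levels and temperature. What comes back is the Hamiltonian shifted by the same constant on every state (namely $T \log Z$), which is the sense in which the Gibbs state determines it.

```python
import math

def gibbs(energies, T):
    """Normalized Gibbs weights exp(-E/T)."""
    weights = [math.exp(-E / T) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def hamiltonian_from_gibbs(probs, T):
    """Invert the Gibbs weights: E = -T log(p), defined up to a constant."""
    return [-T * math.log(p) for p in probs]

energies = [0.3, 1.0, 2.5]           # hypothetical energy levels
T = 2.0
probs = gibbs(energies, T)
recovered = hamiltonian_from_gibbs(probs, T)

# Every recovered energy differs from the original by the same shift.
shift = recovered[0] - energies[0]
differences = [r - e for r, e in zip(recovered, energies)]
```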

This is the classical version of the idea Carlo Rovelli calls “Thermal Time”, which I first encountered in his book “Quantum Gravity”, but also is summarized in Rovelli’s FQXi essay “Forget Time”, and described in more detail in this paper by Rovelli and Alain Connes. Mathematically, this involves the Tomita flow on von Neumann algebras (which Connes used to great effect in his work on the classification of same). It was reading “Forget Time” which originally got me thinking about making the series of posts about different notions of state.

Physically, remember, these are von Neumann algebras of operators on a quantum system, the self-adjoint ones being observables; states are linear functionals on such algebras. The equivalent of a Gibbs state – a thermal equilibrium state – is called a KMS (Kubo-Martin-Schwinger) state (for a particular Hamiltonian). It’s important that the KMS state depends on the Hamiltonian, which is to say the dynamics and the notion of time with respect to which the system will evolve. Given a notion of time flow, there is a notion of KMS state.

One interesting place where KMS states come up is in (general) relativistic thermodynamics. In particular, the effect called the Unruh Effect is an example (here I’m referencing Robert Wald’s book, “Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics”). Physically, the Unruh effect says the following. Suppose you’re in flat spacetime (described by Minkowski space), and an inertial (unaccelerated) observer sees it as a vacuum. Then an accelerated observer will see space as full of a bath of particles at some temperature related to the acceleration. Mathematically, a change of coordinates (acceleration) implies there’s a one-parameter family of automorphisms of the von Neumann algebra which describes the quantum field for particles. There’s also a (trivial) family for the unaccelerated observer, since the coordinate system is not changing. The Unruh effect in this language is the fact that a vacuum state relative to the time-flow for an unaccelerated observer is a KMS state relative to the time-flow for the accelerated observer (at some temperature related to the acceleration).
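To get a feel for the size of the effect, here is a small calculation using the standard textbook formula $T = \hbar a / (2 \pi c k_B)$ for the Unruh temperature. That formula isn’t spelled out above, so take it as my addition. The constants are the SI values, and the sample acceleration is just Earth’s surface gravity.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299792458.0          # speed of light, m / s
K_B = 1.380649e-23       # Boltzmann constant, J / K

def unruh_temperature(a):
    """Unruh temperature in kelvin for proper acceleration a in m/s^2."""
    return HBAR * a / (2.0 * math.pi * C * K_B)

# Hypothetical sample: an observer accelerating at one Earth gravity.
T_earth_g = unruh_temperature(9.81)
```

The result is around $4 \times 10^{-20}$ kelvin, which gives some idea why nobody notices this bath of particles in everyday life.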

The KMS state for a von Neumann algebra with a given Hamiltonian operator has a density matrix $\omega$, which is again, up to some constant factors, just the exponential of minus the Hamiltonian operator (scaled by the temperature, as for the Gibbs state). (For pure states, $\omega = |\Psi \rangle \langle \Psi |$, and in general a matrix becomes a state by $\omega(A) = \mathrm{Tr}(A \omega)$, which for pure states is just the usual expectation value for $A$, $\langle \Psi | A | \Psi \rangle$.)
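In finite dimensions all of this can be checked directly. The following sketch uses NumPy with a made-up $2 \times 2$ Hamiltonian and observable: it builds a thermal density matrix proportional to $\exp(-H/T)$, turns a density matrix into a state via the trace, and computes the same thing for a pure state.

```python
import numpy as np

def thermal_density_matrix(H, T):
    """Density matrix proportional to exp(-H/T), normalized to trace 1."""
    w, v = np.linalg.eigh(H)                 # H is assumed Hermitian
    rho = (v * np.exp(-w / T)) @ v.conj().T  # v diag(exp(-w/T)) v^dagger
    return rho / np.trace(rho).real

def state(rho):
    """The linear functional A -> Tr(A rho) defined by a density matrix."""
    return lambda A: np.trace(A @ rho)

# Hypothetical two-level system.
H = np.array([[1.0, 0.5], [0.5, -1.0]])
A = np.array([[0.0, 1.0], [1.0, 0.0]])

psi = np.array([1.0, 1.0j]) / np.sqrt(2)
pure = np.outer(psi, psi.conj())             # |psi><psi|

omega_pure = state(pure)
omega_thermal = state(thermal_density_matrix(H, T=1.0))
```

For the pure state, `omega_pure(A)` agrees with the usual expectation value `psi.conj() @ A @ psi`, and both states send the identity to 1.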

Now, things are a bit more complicated in the von Neumann algebra picture than the classical picture, but Tomita-Takesaki theory tells us that as in the classical world, the correspondence between dynamics and KMS states goes both ways: there is a flow – the Tomita flow – associated to any given state, with respect to which the state is a KMS state. By “flow” here, I mean a one-parameter family of automorphisms of the von Neumann algebra. In the Heisenberg formalism for quantum mechanics, this is just what time is (i.e. states remain the same, but the algebra of observables is deformed with time). The way you find it is as follows (and why this is right involves some operator algebra I find a bit mysterious):

First, get the algebra $\mathcal{A}$ acting on a Hilbert space $H$, with a cyclic vector $\Psi$ (i.e. such that $\mathcal{A} \Psi$ is dense in $H$ – one way to get this is by the GNS representation, so that the state $\omega$ just acts on an operator $A$ by the expectation value at $\Psi$, as above, so that the vector $\Psi$ is standing in, in the Hilbert space picture, for the state $\omega$). Then one can define an operator $S$ by the fact that, for any $A \in \mathcal{A}$, one has

$(SA)\Psi = A^{\star}\Psi$

That is, $S$ acts like the conjugation operation on operators at $\Psi$, which is enough to define $S$ since $\Psi$ is cyclic. This $S$ has a polar decomposition (analogous for operators to the polar form for complex numbers) of $S = J \Delta^{1/2}$, where $J$ is antiunitary (this is conjugation, after all) and $\Delta$ is positive and self-adjoint. We need the self-adjoint part, because the Tomita flow is a one-parameter family of automorphisms given by:

$\alpha_t(A) = \Delta^{-it} A \Delta^{it}$
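In the finite-dimensional case one can at least watch this happen numerically. The sketch below leans on an assumption not derived here: as far as I understand the standard picture, for a faithful state with density matrix $\rho$ on a full matrix algebra, conjugating by powers of $\Delta$ amounts to conjugating by the same powers of $\rho$, so with the sign convention above $\alpha_t(A) = \rho^{-it} A \rho^{it}$. The Hamiltonian, observable, and time are invented for the example.

```python
import numpy as np

def hermitian_power(M, z):
    """M**z for a positive-definite Hermitian matrix and complex exponent z."""
    w, v = np.linalg.eigh(M)
    return (v * w.astype(complex) ** z) @ v.conj().T

def gibbs_state(H, beta=1.0):
    """Density matrix exp(-beta H) / Z."""
    w, v = np.linalg.eigh(H)
    rho = (v * np.exp(-beta * w)) @ v.conj().T
    return rho / np.trace(rho).real

def modular_flow(rho, A, t):
    """Assumed finite-dimensional form of the flow: rho^{-it} A rho^{it}."""
    return hermitian_power(rho, -1j * t) @ A @ hermitian_power(rho, 1j * t)

def heisenberg(H, A, t):
    """Ordinary Heisenberg evolution exp(iHt) A exp(-iHt)."""
    w, v = np.linalg.eigh(H)
    U = (v * np.exp(1j * w * t)) @ v.conj().T
    return U @ A @ U.conj().T

# Hypothetical two-level system at inverse temperature beta = 1.
H = np.array([[0.0, 1.0], [1.0, 2.0]])
A = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=complex)
rho = gibbs_state(H)
t = 0.7
flowed = modular_flow(rho, A, t)
```

With $\rho$ a Gibbs state, `flowed` matches the Heisenberg evolution generated by $H$, and the state $A \mapsto \mathrm{Tr}(A\rho)$ is unchanged by the flow. Notice that in this setting the flow is conjugation by a unitary in the algebra itself.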

An important fact for Connes’ classification of von Neumann algebras is that the Tomita flow is basically unique – that is, it’s unique up to an inner automorphism (i.e. a conjugation by some unitary operator – so in particular, if we’re talking about a relativistic physical theory, a change of coordinates giving a different $t$ parameter would be an example). So while there are different flows, they’re all “essentially” the same. There’s a unique notion of time flow if we reduce the algebra $\mathcal{A}$ to its cosets modulo inner automorphism. Now, in some cases, the Tomita flow consists entirely of inner automorphisms, and this reduction makes it disappear entirely (this happens in the finite-dimensional case, for instance). But in the general case this doesn’t happen, and the Connes-Rovelli paper summarizes this by saying that von Neumann algebras are “intrinsically dynamic objects”. So this is one interesting thing about the quantum view of states: there is a somewhat canonical notion of dynamics present just by virtue of the way states are described. In the classical world, this isn’t the case.

Now, Rovelli’s “Thermal Time” hypothesis is, basically, that the notion of time is a state-dependent one: instead of an independent variable, with respect to which other variables change, quantum mechanics (per Rovelli) makes predictions about correlations between different observed variables. More precisely, the hypothesis is that, given that we observe the world in some state, the right notion of time should just be the Tomita flow for that state. They claim that checking this for certain cosmological models, like the Friedman model, they get the usual notion of time flow. I have to admit, I have trouble grokking this idea as fundamental physics, because it seems like it’s implying that the universe (or any system in it we look at) is always, a priori, in thermal equilibrium, which seems wrong to me since it evidently isn’t. The Friedman model does assume an expanding universe in thermal equilibrium, but clearly we’re not in exactly that world. On the other hand, the Tomita flow is definitely there in the von Neumann algebra view of quantum mechanics and states, so possibly I’m misinterpreting the nature of the claim. Also, as applied to quantum gravity, a “state” perhaps should be read as a state for the whole spacetime geometry of the universe – which is presumably static – and then the apparent “time change” would then be a result of the Tomita flow on operators describing actual physical observables. But on this view, I’m not sure how to understand “thermal equilibrium”.  So in the end, I don’t really know how to take the “Thermal Time Hypothesis” as physics.

In any case, the idea that the right notion of time should be state-dependent does make some intuitive sense. The only physically, empirically accessible referent for time is “what a clock measures”: in other words, there is some chosen system which we refer to whenever we say we’re “measuring time”. Different choices of system (that is, different clocks) will give different readings even if they happen to be moving together in an inertial frame – atomic clocks sitting side by side will still gradually drift out of sync. Even if “the system” means the whole universe, or just the gravitational field, clearly the notion of time even in General Relativity depends on the state of this system. If there is a non-state-dependent “god’s-eye view” of which variable is time, we don’t have empirical access to it. So while I can’t really assess this idea confidently, it does seem to be getting at something important.

So as I mentioned in my previous post, I attended 80% of the conference “Categories, Quanta, Concepts”, hosted by the Perimeter Institute.  Videos of many of the talks are online, but on the assumption that not everyone will watch them all, I’ll comment anyway…😉

It dealt with various takes on the uses of category theory in fundamental physics, and quantum physics particularly. One basic theme is that the language of categories can organize and clarify the concepts that show up here. Since there doesn’t seem to be a really universal agreement on what “fundamental” physics is, or what the concepts involved might be, this is probably a good thing.

There were a lot of talks, so I’ll split this into a couple of posts – this first one dealing with two obvious category-related themes – monoidal categories and toposes.  The next post will cover most of the others – roughly, focused on fundamentals of quantum mechanics, and on categories for logic and language.

Monoidal Categories

So a large contingent came from Oxford’s Comlab, many of them looking at ideas that I first saw popularized by Abramsky and Coecke about describing the features of quantum mechanics that appear in any dagger-compact category. This yields a “string diagram” notation for quantum systems. (An explanation of this system is given by Abramsky and Coecke – http://arxiv.org/abs/0808.1023 – or more concisely by Coecke – http://arxiv.org/abs/quant-ph/0510032).

Samson Abramsky talked about diagonal arguments. This is a broad class of arguments including Cantor’s theorem (that the real line is uncountable), Russell’s paradox in set theory (about the “set” of non-self-membered sets), Gödel’s incompleteness theorem, and others. Abramsky’s talk was based on Bill Lawvere’s analysis of these arguments in general cartesian closed categories (CCC’s). The relevance to quantum theory has to do with “no-cloning” theorems – that quantum states can’t be duplicated. Diagonal arguments involve two capabilities: the ability to duplicate objects, and the ability to represent predicates (think of Gödel numbering, for instance) which is related to a fixed point property. Generalizing to other monoidal categories, one still has representability: linear functionals on Hilbert spaces can be represented by vectors. But diagonal arguments fail since there is no diagonal $\Delta : H \rightarrow H \otimes H$.
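The obstruction is easy to see in coordinates: the obvious candidate for a diagonal, $v \mapsto v \otimes v$, is not a linear map. Here is a tiny NumPy check on two arbitrarily chosen vectors in $\mathbb{C}^2$. It is only an illustration of that one candidate, not a proof of the no-cloning theorem.

```python
import numpy as np

def diagonal(v):
    """The naive 'duplication' v -> v (x) v."""
    return np.kron(v, v)

v = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])

lhs = diagonal(v + w)              # duplicate the sum
rhs = diagonal(v) + diagonal(w)    # sum of the duplicates
is_additive = np.allclose(lhs, rhs)
```

Here `lhs` contains the cross terms $v \otimes w + w \otimes v$ that `rhs` lacks, so `is_additive` comes out false.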

Bob Coecke and Ross Duncan both spoke about “complementary observables”. Part of this comes from their notion of an “observable structure”, or “classical structure” for a quantum system. The intuition here is that this is some collection of observables which we can simultaneously observe, and such that, if we restrict to those observables, and states which are eigenstates for them, we can treat the whole system as if it were classical. In particular, this gives us “copy” and “destroy” operations for states – these maps and their duals actually turn out to define a Frobenius algebra. In finite-dimensional Hilbert spaces, this is equivalent to choosing an orthonormal basis.
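For a chosen orthonormal basis of $\mathbb{C}^n$, the “copy” and “destroy” maps can be written as matrices and the algebraic laws checked by hand. In the sketch below the basis is the standard one, $n = 3$ is arbitrary, and the function names are mine. It checks that copying followed by its adjoint is the identity, the Frobenius law, and that “destroy” is a counit for “copy”.

```python
import numpy as np

def copy_map(n):
    """Linear map C^n -> C^n (x) C^n sending e_i to e_i (x) e_i."""
    delta = np.zeros((n * n, n))
    for i in range(n):
        delta[i * n + i, i] = 1.0
    return delta

def delete_map(n):
    """Linear map C^n -> C sending every basis vector e_i to 1."""
    return np.ones((1, n))

n = 3
delta = copy_map(n)
mu = delta.T                       # adjoint of copy: merges matching basis vectors
I = np.eye(n)

special = np.allclose(mu @ delta, I)
frobenius = np.allclose(np.kron(mu, I) @ np.kron(I, delta), delta @ mu)
counit = np.allclose(np.kron(delete_map(n), I) @ delta, I)
```

Note that `delta` copies basis vectors only. Applied to a superposition it produces an entangled vector rather than two copies, which is consistent with no-cloning.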

Complementarity of observables is related to the concept of mutually unbiased bases. So the bases $\{v_i\}$ and $\{w_j\}$ are unbiased if all the inner products $\langle v_i , w_j \rangle$ have the same magnitude. If these bases are associated to observables (say, they form a basis of eigenvectors), then knowing a classical value of one observable gives no information about the other – all eigenstates are equally likely. For a visual image, think of two sets of bases for the plane, rotated 45 degrees relative to each other. Each basis vector in one has a projection of equal length onto both basis vectors of the other.
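The picture of two bases of the plane can be checked in a few lines. This sketch compares the standard basis with a basis rotated by 45 degrees, and with one rotated by 30 degrees for contrast (the second angle is my own choice of counterexample).

```python
import numpy as np

def rotated_basis(theta):
    """Orthonormal basis of the plane, as matrix columns, rotated by theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def are_unbiased(B1, B2, tol=1e-12):
    """True if all inner products between the two bases have equal magnitude."""
    overlaps = np.abs(B1.conj().T @ B2)
    return bool(np.all(np.abs(overlaps - overlaps[0, 0]) < tol))

standard = np.eye(2)
diag45 = rotated_basis(np.pi / 4)
tilted = rotated_basis(np.pi / 6)
```

`are_unbiased(standard, diag45)` holds, with every overlap equal to $1/\sqrt{2}$, while `are_unbiased(standard, tilted)` fails.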

Thinking of the orthonormal bases as “observable structures”, the mutually unbiased ones correspond to “complementary” observables: a state which is classical for one observable (i.e. is an eigenstate for that operator) is unbiased (i.e. has equal probabilities of having any value) for the other observable. Labelling the different structures with colours (red and green, usually), they could diagrammatically represent states being classical or unbiased in particular systems.

This is where “phase groups” come into play. The setup is that we’re given some system – the toy model they often referred to was a spinning particle in 3D – and an observable system (say, just containing the observable “spin in the X direction”). Then there’s a group of symmetries of the system which leave that observable untouched (in that example, the symmetries are rotation about the X axis). This is the “phase group” for that observable.

Bill Edwards talked about phase groups and how they can be used to classify systems. He gave an example of a couple of toy models with six states each. One was based on spin (the six states describe spins about each axis in 3-space in each direction). The other, due to Robert Spekkens, is a “hidden variable” theory, where there are four possible “ontic” states (the “hidden” variable), but the six “epistemic” states only register whether the state lies in one of six possible pairs of ontic states. The two toy models resemble each other at the level of states, but the phase groups are different: the truly “quantum” one has a cyclic group $\mathbb{Z}_4$ (for the X-spin observable, it’s generated by a right-angled rotation about the X axis); the “hidden variable” model, which has some quantum-mechanics-like features, but not all, has phase group $\mathbb{Z}_2 \times \mathbb{Z}_2$. The suggestion of the talk was that this phase group distinguishes “local” from “nonlocal” systems (i.e. ones with hidden variable models and ones without).
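Both phase groups have four elements, so it’s worth seeing concretely why they differ. The following sketch (plain group arithmetic, nothing specific to the toy models) lists the order of each element: $\mathbb{Z}_4$ has elements of order 4, while every non-identity element of $\mathbb{Z}_2 \times \mathbb{Z}_2$ has order 2.

```python
from itertools import product

def element_orders(elements, op, identity):
    """Sorted list of the orders of all elements of a finite group."""
    orders = []
    for g in elements:
        k, x = 1, g
        while x != identity:
            x = op(x, g)           # x is now g combined with itself k + 1 times
            k += 1
        orders.append(k)
    return sorted(orders)

z4 = list(range(4))
z4_orders = element_orders(z4, lambda a, b: (a + b) % 4, 0)

z2xz2 = list(product(range(2), repeat=2))
z2xz2_orders = element_orders(
    z2xz2,
    lambda a, b: ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2),
    (0, 0),
)
```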

Marni Sheppard also gave a talk about Mutually Unbiased Bases, p-adic arithmetic, and algebraic geometry over finite fields, which I find hard to summarize because I don’t understand all those fields very well. Roughly, her talk made a link between quantum mechanics and an axiomatic version of projective geometry (Hilbert spaces in QM ought to be projective, after all, so this makes sense).  There was also a connection between mutually unbiased bases and finite fields, but again, this sort of escaped me.

Also in this group was Jamie Vicary, whom I’ve been working with on a project about the categorified harmonic oscillator.  His talk, however, was about n-Hilbert spaces, and n-categorical extended TQFT.  The basic point is that a TQFT assigns a number to a closed n-manifold, and a Hilbert space to each (n-1)-manifold (such as a boundary between two parts of a closed one), and if the TQFT is fully local (i.e. can be derived from, say, a triangulation), this can be continued to have it assign k-Hilbert spaces to (n-k)-manifolds for all k up to n.  He described the structure of 2-Hilbert spaces, and also monoidal ones (as many interesting cases are), and how they can all be realized (in finite dimensions, at least) as categories of representations of supergroupoids.  Part of the point of this talk was to suggest how not just dagger-compact categories, but general n-categories should be useful for quantum theory.

Toposes

The monoidal category setting is popular for dealing with quantum theories, since it abstracts some properties of Hilbert spaces, which they’re usually modelled in.  Toposes are usually thought of as generalizations of the category of sets, and in particular they model intuitionistic (or classical), not quantum, logic.  So the talk by Andreas Döring (based on work with Christopher Isham – see many of Andreas’ recent papers) called “Why Topos Theory in the Foundations of Physics?” is surprising if you haven’t heard this idea before.  One motivation could be described in terms of the Kochen-Specker theorem, which, roughly, says that a quantum theory – involving observables which are operators on a Hilbert space of dimension at least three – can’t be modeled by a “state space”.  That is, it’s not the case that you can simultaneously give definite values to all the observables in a consistent way – in ANY state!  (That is, it’s not just the generic state: there is no state at all which corresponds to the classical picture of a “point” in some space parametrized by the observables.)

Now, part of the point is that there’s no “state space” in the category of sets – but maybe there is in some other topos!  And sure enough, the equivalent of a state space turns out to be a thing they call the “spectral presheaf” for the theory.  It’s an object in some topos.  The KS theorem becomes a statement that it has no “global points”.  To see what this means, you have to know what the spectral presheaf is.

This is based on the assumption that one has a (noncommutative) von Neumann algebra of operators on a Hilbert space – among them, the observables we might be interested in.  The structure of this algebra is supposed to describe some system.  Now you might want to look for subalgebras of it which are abelian.  Why?  Because a system of commuting operators, should they be observables, are ones which we CAN assign values to simultaneously – there’s no issue of which order we do measurements in.  Call this a “context” – a choice of subalgebra making the system look classical.  So maybe we can describe a “state space” in a context: so what?

Well, the collection of all such contexts forms a poset – in fact, lattice – in fact, a complete Heyting algebra.  These objects are just the same (object-wise) as “locales” (a generalization from topological spaces, and their lattice of open sets).  The topos in question is the category of presheaves on this locale, which is to say, of contravariant functors to Set.  Which is to say… a way of assigning a set (the “state space” I mentioned), with a way of restricting sets along inclusion maps.  This restriction can be a bit rough (in fact, the fact that restriction can be quite approximate is just where uncertainty principles and the like come from).  The main point is that this “spectral presheaf” (the assignment of local state spaces to each context) supports a concept of logic, for reasoning about the system it describes.  It’s a lot like the logic of sets, but operations happen “context-by-context”.  A proposition has a truth value which is a “downset” in the lattice of contexts – the collection of contexts where the proposition is true.  A proposition just amounts to a subobject of the spectral presheaf by what they call “daseinization” – it’s the equivalent of a proposition being a subset of a configuration space (where the statement is true).

One could say a lot more, but this is a blog post, after all.

There are philosophical issues that this subject seems to provoke – the sign of an interesting theory is that it gets people arguing, I suppose.  One is the characterization of this as a “neo-realist interpretation” of quantum theory.  A “naive realist” interpretation would be one that says a “state” is just a way of saying what all the values of all the observable quantities are – to put it another way, of giving definite truth values to all definite “yes/no” questions.  This is just what the KS theorem says can’t happen.  The spectral presheaf is supposedly “neo-realist” because it does almost these things, but in an exotic topos (of presheaves on the locale of all classical contexts).  As you might expect, this is a bit of a head-scratcher.

I spent most of last week attending four of the five days of the workshop “Categories, Quanta, Concepts”, at the Perimeter Institute.  In the next few days I plan to write up many of the talks, but it was quite a lot.  For the moment, I’d like to do a little writeup on the talk I gave.  I wasn’t originally expecting to speak, but the organizers wanted the grad students and postdocs who weren’t talking in the scheduled sessions to give little talks.  So I gave a short version of this one which I gave in Ottawa but as a blackboard talk, so I have no slides for it.

Now, the workshop had about ten people from Oxford’s Comlab visiting, including Samson Abramsky and Bob Coecke, Marni Sheppard, Jamie Vicary, and about half a dozen others.  Many folks in this group work in the context of dagger compact categories, which is a nice abstract setting that captures a lot of the features of the category $Hilb$ which are relevant to quantum mechanics.  Jamie Vicary had, earlier that day, given a talk about n-dimensional TQFT’s and n-categories – specifically, n-Hilbert spaces.  I’ll write up their talks in a later post, but it was a nice context in which to give the talk.

The point of this talk is to describe, briefly, $Span(Gpd)$ – as a category and as a 2-category; to explain why it’s a good conceptual setting for quantum theory; and to show how it bridges the gap between Hilbert spaces and 2-Hilbert spaces.

History and Symmetry

In the course of an afternoon discussion session, we were talking about the various approaches people are taking in fundamentals of quantum theory, and in trying to find a “quantum theory of gravity” (whatever that ends up meaning).  I raised a question about robust ideas: basically, it seems to me that if an idea shows up across many different domains, that’s probably a sign it belongs in a good theory.  I was hoping people knew of a number of these notions, because there are really only two I’ve seen in this light, and really there probably should be more.

The two physical  notions that motivate everything here are (1) symmetry, and (2) emphasis on histories.  Both ideas are applied to states: states have symmetries; histories link starting states to ending states.  Combining them suggests histories should have symmetries of their own, which ought to get along with the symmetries of the states they begin and end with.

Both concepts are rather fundamental. Hermann Weyl wrote a whole book, “Symmetry”, about the first, and wrote that, as far as he could see, all a-priori statements in physics are based on symmetry. Symmetry shows up everywhere, from diffeomorphism invariance in general relativity, to gauge symmetry in quantum field theory, to symmetric tensor products involved in Fock space, through classical examples like Noether’s theorem. Noether’s theorem is also about histories: it applies when a symmetry holds along an entire history of a system. In fact, Lagrangian mechanics generally is all about histories, and how they’re selected to be “real” in a classical system (by having a critical value of the action functional). The Lagrangian point of view appears in quantum theory (and this was what Richard Feynman did in his thesis) as the famous “sum over histories”, or path integral. General relativity embraces histories as real – they’re spacetimes, which is what GR is all about. So these concepts seem to hold up rather well across different contexts.

I began by drawing this table:

$\begin{array}{cc} Sets & Span(Sets) \rightarrow Rel \\ Gpd & Span(Gpd) \end{array}$

The names are all those of categories. Moving left to right moves from a category describing collections of states, to one describing states-and-histories. It so happens that it also takes a cartesian category (or 2-category) to a symmetric monoidal one. Moving from top to bottom goes from a setting with no symmetry to one with symmetry. In both cases, the key concept is naturally expressed with a category, and shows up in morphisms. Now, since groupoids are already categories, both of the bottom entries properly ought to be 2-categories, but when we choose to, we can ignore that fact.

Why Spans?

I’ve written a bunch on spans here before, but to recap, a span in a category $C$ is a diagram like: $X \stackrel{s}{\leftarrow} H \stackrel{t}{\rightarrow} Y$. Say we’re in $Sets$, so all these objects are sets: we interpret $X$ and $Y$ as sets of states. Each one describes some system by collecting all its possible (“pure”) states. (To be better, we could start with a different base category – symplectic manifolds, say – and see if the rest of the analysis goes through). For now, we just realize that $H$ is a set of histories leading the system $X$ to the system $Y$ (notice there’s no assumption the system is the same). The maps $s,t$ are source and target maps: they specify the unique state where a history $h \in H$ starts and where it ends.

If $C$ has pullbacks (or at least any we may need), we can use them to compose spans:

$X \stackrel{s_1}{\leftarrow} H_1 \stackrel{t_1}{\rightarrow} Y \stackrel{s_2}{\leftarrow} H_2 \stackrel{t_2}{\rightarrow} Z \stackrel{\circ}{\Longrightarrow} X \stackrel{S}{\leftarrow} H_1 \times_Y H_2 \stackrel{T}{\rightarrow} Z$

The pullback $H_1 \times_Y H_2$ – a fibred product if we’re in $Sets$ – picks out pairs of histories in $H_1 \times H_2$ which match at $Y$. This should be exactly the possible histories taking $X$ to $Z$.
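For finite sets this composition is easy to spell out. In the sketch below a span is stored as a triple `(H, s, t)`, with `s` and `t` as dictionaries from histories to states. That encoding, and the little example with states `x`, `y1`, `y2`, `z`, are my own choices for illustration.

```python
def compose_spans(span1, span2):
    """Compose spans of finite sets via the fibred product over the middle set."""
    H1, s1, t1 = span1
    H2, s2, t2 = span2
    # Pairs of histories which match at the intermediate state.
    H = [(h1, h2) for h1 in H1 for h2 in H2 if t1[h1] == s2[h2]]
    S = {h: s1[h[0]] for h in H}   # source of the composite history
    T = {h: t2[h[1]] for h in H}   # target of the composite history
    return H, S, T

# Hypothetical example: two histories out of x, three histories into z.
span_XY = (["a", "b"], {"a": "x", "b": "x"}, {"a": "y1", "b": "y2"})
span_YZ = (
    ["c", "d", "e"],
    {"c": "y1", "d": "y1", "e": "y2"},
    {"c": "z", "d": "z", "e": "z"},
)

H, S, T = compose_spans(span_XY, span_YZ)
```

The composite has three histories from `x` to `z`: two passing through `y1` and one through `y2`.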

I’ve included an arrow to the category $Rel$: this is the category whose objects are sets, and whose morphisms are relations. A number of people at CQC mentioned $Rel$ as an example of a monoidal category which supports toy models having some but not all features of quantum mechanics. It happens to be a quotient of $Span(Sets)$. A relation is an equivalence class of spans, where we only notice whether the set of histories connecting $x \in X$ to $y \in Y$ is empty or not. $Span(Sets)$ is more like quantum mechanics, because its composition is just like matrix multiplication: counting the number of histories from $x$ to $y$ turns the span into a $|X| \times |Y|$ matrix – so we can think of $X$ and $Y$ as being like vector spaces.

In fact, there’s a map $L : Span(Sets) \rightarrow Hilb$ taking an object $X$ to $\mathbb{C}^X$ and a span to the matrix I just mentioned, which faithfully represents $Span(Sets)$. A more conceptual way to say this is: a function $f : X \rightarrow \mathbb{C}$ can be transported across the span. It lifts to $H$ as $f \circ s : H \rightarrow \mathbb{C}$. Getting down the other leg, we add all the contributions of each history ending at a given $y$: $t_*(f \circ s)(y) = \sum_{t(h)=y} (f \circ s)(h)$.

This “sum over histories” is what matrix multiplication actually is.
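Here is a sketch of that claim for finite sets, using NumPy. Spans are stored as triples `(H, s, t)` with dictionaries for the two legs (my own encoding), and I index rows by the target set so that the matrix acts on column vectors. The example spans are invented. The code checks that the matrix of a composite span is the product of the matrices, and that pulling a function back along `s` and summing along `t` is the same as multiplying by the matrix.

```python
import numpy as np

def span_matrix(span, X, Y):
    """Matrix whose (y, x) entry counts the histories from x to y."""
    H, s, t = span
    M = np.zeros((len(Y), len(X)), dtype=int)
    for h in H:
        M[Y.index(t[h]), X.index(s[h])] += 1
    return M

def compose_spans(span1, span2):
    """Composite span: pairs of histories matching at the middle state."""
    H1, s1, t1 = span1
    H2, s2, t2 = span2
    H = [(h1, h2) for h1 in H1 for h2 in H2 if t1[h1] == s2[h2]]
    return H, {h: s1[h[0]] for h in H}, {h: t2[h[1]] for h in H}

def push_pull(span, f, Y):
    """Pull f back along s, then sum over histories ending at each y."""
    H, s, t = span
    return np.array([sum(f[s[h]] for h in H if t[h] == y) for y in Y])

X, Y, Z = ["x"], ["y1", "y2"], ["z"]
span_XY = (["a", "b"], {"a": "x", "b": "x"}, {"a": "y1", "b": "y2"})
span_YZ = (
    ["c", "d", "e"],
    {"c": "y1", "d": "y1", "e": "y2"},
    {"c": "z", "d": "z", "e": "z"},
)

M1 = span_matrix(span_XY, X, Y)
M2 = span_matrix(span_YZ, Y, Z)
M_composite = span_matrix(compose_spans(span_XY, span_YZ), X, Z)

f = {"y1": 2.0, "y2": 5.0}                 # a function on Y
pushed = push_pull(span_YZ, f, Z)
relation = M2 > 0                          # the underlying relation in Rel
```

Passing to `relation` forgets the counts and remembers only whether any history exists, which is the quotient down to $Rel$.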

Why Groupoids?

The point of groupoids is that they represent sets with a notion of (local) symmetry. A groupoid is a category with invertible morphisms. Each such isomorphism tells us that two states are in some sense “the same”. The beginning example is the “action groupoid” that comes from a group $G$ acting on a set $X$, which we call $X /\!\!/ G$ (or the “weak quotient” of $X$ by $G$).
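As a concrete instance, here is a sketch of the action groupoid for a very small example of my own choosing: the two-element group acting on $\{-1, 0, 1\}$ by flipping signs. Morphisms are stored as triples `(x, g, g.x)`, and the helper names are hypothetical.

```python
def action_groupoid(X, G, act):
    """Morphisms of X // G: one arrow x -> act(g, x) for each pair (x, g)."""
    return [(x, g, act(g, x)) for x in X for g in G]

def orbits(X, G, act):
    """Isomorphism classes of objects, i.e. the orbits of the action."""
    seen, result = set(), []
    for x in X:
        if x not in seen:
            orbit = {act(g, x) for g in G}
            seen |= orbit
            result.append(orbit)
    return result

def stabilizer(x, G, act):
    """Automorphisms of the object x: group elements fixing it."""
    return [g for g in G if act(g, x) == x]

# Hypothetical example: Z_2 = {0, 1} acting on {-1, 0, 1} by sign flip.
X = [-1, 0, 1]
G = [0, 1]
act = lambda g, x: -x if g == 1 else x

morphisms = action_groupoid(X, G, act)
classes = orbits(X, G, act)
```

The states $-1$ and $1$ are identified by a symmetry, while $0$ sits alone but carries a nontrivial symmetry of its own. The ordinary quotient $X/G$ would remember only the two orbits.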

This suggests how groupoids come into the physical picture – the intuition is that $X$ is the set (or, in later variations, space) of states, and $G$ is a group of symmetries.  For example, $G$ could be a group of coordinate transformations: states which can be transformed into each other by a rotation, say, are formally but not physically different.  The Extended TQFT example comes from the case where $X$ is a set of connections, and $G$ the group of gauge transformations.  Of course, not all physically interesting cases come from a single group action: for the harmonic oscillator, the states (“pure states”) are just energy levels – nonnegative integers.  On each state $n$, there is an action of the permutation group $S_n$ – a “local” symmetry.

One nice thing about groupoids is that one often really only wants to think about them up to equivalence – as a result, it becomes a matter of convention whether formally different but physically indistinguishable states are really considered different.  There’s a side effect, though: $Gpd$ is a 2-category.  In particular, this has two consequences for $Span(Gpd)$: it ought to have 2-morphisms, so we stop thinking about spans up to isomorphism.  Instead, we allow spans of span maps as 2-morphisms.  Also, when composing spans (which are no longer taken up to isomorphism) we have to use a weak pullback, not an ordinary one.  I didn’t have time to say much about the 2-morphism level in the CQC talk, but the slides above do.

In any case, moving into $Span(Gpd)$ means that the arrows in the spans are now functors – in particular, a symmetry of a history $h$ now has to map to a symmetry of the start and end states, $s(h)$ and $t(h)$.  In particular, the functors give homomorphisms of the symmetry groups of each object.

Physics in Hilb and 2Hilb

So the point of the above is really to motivate the claim that there’s a clear physical meaning to groupoids (states and symmetries), and spans of them (putting histories on an even footing with states).  There’s less obvious physical meaning to the usual setting of quantum theory, the category $Hilb$ – but it’s a slightly nicer category than $Span(Gpd)$.  For one thing, there is a concept of a “dual” of a span – it’s the same span, with the roles of $s$ and $t$ interchanged.  However (as Jamie Vicary pointed out to me), it’s not an “adjoint” in $Span(Gpd)$ in the technical sense.  In particular, $Span(Gpd)$ is a symmetric monoidal category, like $Hilb$, but it’s not “dagger compact”, the kind of category all the folks from Oxford like so much.

Now, groupoidification lets us generalize the map $L : Span(Sets) \rightarrow Hilb$ to groupoids making as few changes as possible.  We still use Hilbert space $\mathbb{C}^X$, but now $X$ is the set of isomorphism classes of objects in the groupoid.  The “sum over histories” – in other words, the linear map associated to a span – is found in almost the same way, but histories now have “weights” found using groupoid cardinality (see any of the papers on groupoidification, or my slides above, for the details).  This reproduces a lot of known physics (see my paper on the harmonic oscillator; TQFT’s can also be defined this way).
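Groupoid cardinality, in the usual definition from the groupoidification papers, adds up $1/|Aut(x)|$ over one representative $x$ of each isomorphism class. Here is a sketch with exact fractions. The two inputs are my own examples: a groupoid with automorphism groups of sizes 1 and 2, and the first twenty energy levels of the harmonic oscillator, where level $n$ has symmetry group $S_n$ of size $n!$.

```python
from fractions import Fraction
import math

def groupoid_cardinality(automorphism_group_sizes):
    """Sum of 1/|Aut(x)| over one representative x per isomorphism class."""
    return sum(Fraction(1, k) for k in automorphism_group_sizes)

# Hypothetical example: one class with trivial symmetry, one with a Z_2.
small_example = groupoid_cardinality([1, 2])

# Levels n = 0..19 of the oscillator, each with symmetry group S_n.
oscillator_truncated = groupoid_cardinality(
    math.factorial(n) for n in range(20)
)
```

The first comes out to $3/2$, which is what one gets as $|X|/|G|$ for a two-element group acting on three points with those stabilizers. The second is already within rounding error of $e$.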

While this is “as much like” linearization of $Span(Sets)$ as possible in some sense, it’s not exactly analogous.  It also is rather violent to the structure of the groupoids: at the level of objects it treats $X /\!\!/ G$ as $X/G$. At the morphism level, it ignores everything about the structure of symmetries in the system except how many of them there are.   Since a groupoid is a category, the more direct analogy for $\mathbb{C}^X$ – the set of functions (fancier versions use, say, $L^2$ functions only) from $X$ to $\mathbb{C}$ – is $Hilb^G$, the category of functors from a groupoid $G$ into $Hilb$.  That is, representations of the groupoid.

One of the attractions here is that, because of a generalization of Tannaka-Krein duality, this category will actually be enough to reconstruct the groupoid if it’s reasonably nice.  The representation of $Span(Gpd)$ in $2Hilb$, unlike the one in $Hilb$, is actually faithful for objects, at least for compact or finite groupoids.

Then you can “pull and push” a representation $F$ across a span to get $t_*(F \circ s)$ – using $t_*$, the adjoint functor to pulling back.  This is the 1-morphism level of the 2-functor I call $\Lambda$, generalizing the functor $L$ in the world of sets.  The result is still a “direct sum over histories” – but because we’re dealing with pushing representations through homomorphisms, this adjoint is a bit more complicated than in the 0-category world of $\mathbb{C}$.  (See my slides or paper for the details.)  But it remains true that the weights and so forth used in ordinary groupoidification show up here at the level of 2-morphisms.  So the representation in $2Hilb$ is not a faithful representation of the (intuitively meaningful) category $Span(Gpd)$ either.  But it does capture a fair bit more than Hilbert spaces.

One point of my talk was to try to motivate the use of 2-Hilbert spaces in physics from an a priori point of view.  One thing I think is nice, for this purpose, is to see how our physical intuitions motivate $Span(Gpd)$ – a nice point itself – and then observe that there is this “higher level” span around:

$Hilb \stackrel{|\cdot |}{\leftarrow} Span(Gpd) \stackrel{\Lambda}{\rightarrow} 2Hilb$

### Further Thoughts

Where can one take this?  There seem to be theories whose states and symmetries naturally want to form n-groupoids: in “higher gauge theory”, a sort of gauge theory for categorical groups, one would have connections as states, gauge transformations as symmetries, and some kind of “symmetry of symmetries”, rather as 2-categories have functors, natural transformations between them, and modifications of these.  Perhaps these could be organized into n-dimensional spans-of-spans-of-spans… of n-groupoids.  Then representations of an n-groupoid – namely, n-functors into $(n-1)Hilb$ – could be subjected to the kind of “pull-push” process we’ve just looked at.

Finally, part of the point here was to see how some fundamental physical notions – symmetry and histories – appear across physics, and lead to $Span(Gpd)$.  Presumably these two aren’t enough.  The next principle that looks appealing – because it appears across domains – is some form of an action principle.

But that would be a different talk altogether.
