Why Higher Geometric Quantization

The largest single presentation was a pair of talks on “The Motivation for Higher Geometric Quantum Field Theory” by Urs Schreiber, running to about two and a half hours, based on these notes. This was probably the clearest introduction I’ve seen so far to the motivation for the program he’s been developing for several years. Broadly, the idea is to develop a higher-categorical analog of geometric quantization (GQ for short).

One guiding idea behind this is that we should really be interested in quantization over (higher) stacks, rather than merely spaces. This leads inexorably to a higher-categorical version of GQ itself. The starting point, though, is that the defining features of stacks capture two crucial principles from physics: the gauge principle, and locality. The gauge principle means that we need to keep track not just of connections, but also of gauge transformations, which form respectively the objects and morphisms of a groupoid. “Locality” means that the groupoid of configurations of a physical field on spacetime is determined by its local configurations on regions as small as you like (together with information about how to glue together the data on small regions into larger regions).

Some particularly simple cases can be described globally: a scalar field gives the space of all scalar functions, namely maps into \mathbb{C}; sigma models generalise this to the space of maps \Sigma \rightarrow M for some other target space. These are determined by their values pointwise, so of course are local.

More generally, physicists think of a field theory as given by a fibre bundle V \rightarrow \Sigma (the previous examples being described by trivial bundles \pi : M \times \Sigma \rightarrow \Sigma), where the fields are sections of the bundle. Lagrangian physics is then described by a form on the jet bundle of V, i.e. the bundle whose fibre over p \in \Sigma consists of the space describing the possible first k derivatives of a section over that point.

More generally, a field theory gives a procedure F for taking some space with structure – say a (pseudo-)Riemannian manifold \Sigma – and producing a moduli space X = F(\Sigma) of fields. The Sigma models happen to be representable functors: F(\Sigma) = Maps(\Sigma,M) for some M, the representing object. A prestack is just any functor taking \Sigma to a moduli space of fields. A stack is one which has a “descent condition”, which amounts to the condition of locality: knowing values on small neighbourhoods and how to glue them together determines values on larger neighbourhoods.

The Yoneda lemma says that, for reasonable notions of “space”, the category \mathbf{Spc} (Riemannian manifolds, for instance) from which we picked target spaces M embeds into the category of stacks over \mathbf{Spc}, and that the embedding is fully faithful – so we should just think of this as a generalization of space. However, it’s a generalization we need, because gauge theories determine non-representable stacks. What’s more, the “space” of sections of one of these fibred stacks is also a stack, and this is what plays the role of the moduli space for gauge theory! For higher gauge theories, we will need higher stacks.

All of the above is the classical situation: the next issue is how to quantize such a theory. It involves a generalization of Geometric Quantization (GQ for short). Now a physicist who actually uses GQ will find this perspective weird, but it flows from just the same logic as the usual method.

In ordinary GQ, you have some classical system described by a phase space, a manifold X equipped with a pre-symplectic 2-form \omega \in \Omega^2(X). Intuitively, \omega describes how the space, locally, can be split into conjugate variables. In the phase space for a particle in n-space, these are the “position” and “momentum” variables, and \omega = \sum_i dx^i \wedge dp_i; many other systems have analogous conjugate variables. But what really matters is the form \omega itself, or rather its cohomology class.

Then one wants to build a Hilbert space describing the quantum analog of the system, but in fact, you need a little more than (X,\omega) to do this. The Hilbert space is a space of sections of some bundle whose fibres look like copies of the complex numbers, called the “prequantum line bundle”. It needs to be equipped with a connection, whose curvature is a 2-form in the class of \omega. (If \omega is not symplectic, i.e. is degenerate, this implies there’s some symmetry on X, in which case the line bundle had better be equivariant so that physically equivalent situations correspond to the same state). The easy case is the trivial bundle, so that we get a space of functions, like L^2(X) (for some measure compatible with \omega). In general, though, this function-space picture only makes sense locally in X: this is why the choice of prequantum line bundle is important to the interpretation of the quantized theory.

Since the crucial geometric thing here is a bundle over the moduli space, when the space is a stack, and in the context of higher gauge theory, it’s natural to seek analogous constructions using higher bundles. This would involve, instead of a (pre-)symplectic 2-form \omega, an (n+1)-form called a (pre-)n-plectic form (for an introductory look at this, see Chris Rogers’ paper on the case n=2 over manifolds). This will give a higher analog of the Hilbert space.

Now, maps between Hilbert spaces in GQ come from Lagrangian correspondences – these might be maps of moduli spaces, but in general they consist of a “space of trajectories” equipped with maps into a space of incoming and outgoing configurations. This is a span of pre-symplectic spaces (equipped with pre-quantum line bundles) that satisfies some nice geometric conditions which make it possible to push a section of said line bundle through the correspondence. Since each prequantum line bundle can be seen as a map out of the configuration space into a classifying space (for U(1), or in general an n-group of phases), we get a square. The action functional is a cell that fills this square (see the end of 2.1.3 in Urs’ notes). This is a diagrammatic way to describe the usual GQ construction: the advantage is that it can then be repeated in the more general setting without much change.

This much is about as far as Urs got in his talk, but the notes go further, talking about how to extend this to infinity-stacks, and how the Dold-Kan correspondence tells us nicer descriptions of what we get when linearizing – since quantization puts us into an Abelian category.

I enjoyed these talks, although they were long and Urs came out looking pretty exhausted. While I’ve seen several other talks on this program, this was the first time I’ve seen it discussed from the beginning, with a lot of motivation. This was presumably because we had a physically-minded part of the audience, whereas the talks I’ve seen before were mostly for mathematicians: they usually come in somewhere in the middle and, being more time-limited, miss out some of the details and the motivation. The end result made it quite a natural development. Overall, very helpful!

Continuing from the previous post, we’ll take a detour in a different direction. The physics-oriented talks were by Martin Wolf, Sam Palmer, Thomas Strobl, and Patricia Ritter. Since my background in this subject isn’t particularly physics-y, I’ll do my best to summarize the ones that had obvious connections to other topics, but may be getting things wrong or unbalanced here…

Dirac Sigma Models

Thomas Strobl’s talk, “New Methods in Gauge Theory” (based on a whole series of papers linked to from the conference webpage), started with a discussion of generalizing Sigma Models. Strobl’s talk was a bit high-level physics for me to do it justice, but I came away with the impression of a fairly large program that has several points of contact with more mathematical notions I’ll discuss later.

In particular, Sigma models are physical theories in which a field configuration on spacetime \Sigma is a map X : \Sigma \rightarrow M into some target manifold, or rather (M,g), since we need a metric to integrate and find differentials. Given this, we can define the crucial physics ingredient, an action functional
S[X] = \int_{\Sigma} g_{ij} dX^i \wedge (\star d X^j)
where the dX^i are the differentials of the map into M.

In string theory, \Sigma is the world-sheet of a string and M is ordinary spacetime. This generalizes the simpler example of a moving particle, where \Sigma = \mathbb{R} is just its worldline. In that case, minimizing the action functional above says that the particle moves along geodesics.
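To spell out the worldline case, here is a short sketch of the standard computation. It assumes a Riemannian metric g on M and a parameter t on \Sigma = \mathbb{R} with its standard metric (so \star dt = 1), and it writes \dot{X}^i = dX^i/dt. The Christoffel symbols \Gamma^k_{ij} are notation introduced only for this illustration.

```latex
% On the worldline, dX^i = \dot{X}^i\,dt, so the sigma-model action becomes
S[X] = \int_{\mathbb{R}} g_{ij}(X)\,\dot{X}^i \dot{X}^j \, dt .

% Euler-Lagrange equations for L = g_{ij}\dot{X}^i\dot{X}^j :
\frac{d}{dt}\bigl( 2\, g_{kj}\,\dot{X}^j \bigr) - (\partial_k g_{ij})\,\dot{X}^i \dot{X}^j = 0 .

% Expanding the t-derivative, symmetrizing in (i,j), and raising the index:
\ddot{X}^l + \Gamma^l_{ij}\,\dot{X}^i \dot{X}^j = 0,
\qquad
\Gamma^l_{ij} = \tfrac{1}{2}\, g^{lk}\bigl( \partial_i g_{kj} + \partial_j g_{ki} - \partial_k g_{ij} \bigr),
% which is the geodesic equation on (M,g).
```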

The big generalization introduced is termed a “Dirac Sigma Model” or DSM (the paper that introduces them is this one).

In building up to these DSM, a different generalization notes that if there is a group action G \rhd M that describes “rigid” symmetries of the theory (for Minkowski space we might pick the Poincare group, or perhaps the Lorentz group if we want to fix an origin point), then the action functional on the space Maps(\Sigma,M) is invariant in the direction of any of the symmetries. One can use this to reduce (M,g), by “gauging out” the symmetries to get a quotient (N,h), and get a corresponding S_{gauged} to integrate over N.

To generalize this, note that there’s an action groupoid associated with G \rhd M, and replace this with some other (Poisson) groupoid instead. That is, one thinks of the real target for a gauge theory not as M, but the action groupoid M /\!\!/ G, and then just considers replacing this with some generic groupoid that doesn’t necessarily arise from a group of rigid symmetries on some underlying M. (In this regard, see the second post in this series, about Urs Schreiber’s talk, and stacks as classifying spaces for gauge theories).

The point here seems to be that one wants to get a nice generalization of this situation – in particular, to be able to go backward from N to M, to deal with the possibility that the quotient N may be geometrically badly-behaved. Or rather, given (N,h), to find some (M,g) of which it is a reduction, but which is better behaved. That means needing to be able to treat a Sigma model with symmetry information attached.

There’s also an infinitesimal version of this: locally, invariance means the Lie derivative of the action in the direction of any of the generators of the Lie algebra of G – so called Killing vectors – is zero. So this equation can generalize to a case where there are vectors where the Lie derivative is zero – a so-called “generalized Killing equation”. They may not generate isometries, but can be treated similarly. What they do give, if you integrate these vectors, is a foliation of M. The space of leaves is the quotient N mentioned above.
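For reference, the ordinary (ungeneralized) Killing equation being alluded to can be written out as follows. This is only the textbook condition for a vector field v to generate isometries of (M,g); the “generalized Killing equation” from Strobl’s papers is not reproduced here, and the symbols v and \nabla (the Levi-Civita connection) are notation assumed for the sketch.

```latex
% v is a Killing vector field of (M,g) when the Lie derivative of g along v vanishes:
(\mathcal{L}_v g)_{ij}
  = v^k \partial_k g_{ij} + g_{kj}\,\partial_i v^k + g_{ik}\,\partial_j v^k
  = \nabla_i v_j + \nabla_j v_i
  = 0 .
```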

The most generic situation Thomas discussed is when one has a Dirac structure on M – this is a certain kind of subbundle D \subset TM \oplus T^*M of the tangent-plus-cotangent bundle over M.
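Since “a certain kind of subbundle” is a bit vague, here is the standard definition of a Dirac structure as I understand it, written as a sketch. The pairing, the bracket convention, and the three example structures are the usual textbook ones, not something taken from the talk, and conventions (factors of 1/2, signs in the bracket) vary between authors.

```latex
% Natural symmetric pairing on sections of TM \oplus T^*M:
\langle X + \xi,\; Y + \eta \rangle = \xi(Y) + \eta(X).

% A Dirac structure is a subbundle D \subset TM \oplus T^*M such that
%  (1) D is maximal isotropic:  D = D^{\perp}  (so rank D = dim M), and
%  (2) sections of D are closed under the Courant (Dorfman) bracket
[\![ X + \xi,\; Y + \eta ]\!] = [X,Y] + \mathcal{L}_X \eta - \iota_Y\, d\xi .

% Standard examples:
D_\omega = \{\, X + \iota_X \omega \,\}          % graph of a closed 2-form \omega
D_\pi    = \{\, \pi(\xi,\cdot) + \xi \,\}        % graph of a Poisson bivector \pi
D_F      = F \oplus \mathrm{Ann}(F)              % an integrable distribution F (a foliation)
```

The last example is the one closest to the foliation picture above: the leaves of F play the role of the orbits being “gauged out”.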

Supersymmetric Field Theories

Another couple of physics-y talks related higher gauge theory to some particular physics models, namely N=(2,0) and N=(1,0) supersymmetric field theories.

The first, by Martin Wolf, was called “Self-Dual Higher Gauge Theory”, and was rooted in generalizing some ideas about twistor geometry – here are some lecture notes by the same author, about how twistor geometry relates to ordinary gauge theory.

The idea of twistor geometry is somewhat analogous to the idea of a Fourier transform, which is ultimately that the same space of fields can be described in two different ways. The Fourier transform goes from looking at functions on a position space, to functions on a frequency space, by way of an integral transform. The Penrose-Ward transform, analogously, transforms a space of fields on Minkowski spacetime, satisfying one set of equations, to a set of fields on “twistor space”, satisfying a different set of equations. The theories represented by those fields are then equivalent (as long as the PW transform is an isomorphism).

The PW transform is described by a “correspondence”, or “double fibration” of spaces – what I would term a “span”, such that both maps are fibrations:

P \stackrel{\pi_1}{\leftarrow} K \stackrel{\pi_2}{\rightarrow} M

The general story of such correspondences is that one has some geometric data on P, which we call Ob_P – a set of functions, differential forms, vector bundles, cohomology classes, etc. They are pulled back to K, and then “pushed forward” to M by a direct image functor. In many cases, this is given by an integral along each fibre of the fibration \pi_2, so we have an integral transform. The image of Ob_P we call Ob_M, and it consists of data satisfying, typically, some PDEs.

In the case of the PW transform, P is complex projective 3-space with a line removed, \mathbb{P}^3 \setminus \mathbb{P}^1, and Ob_P is the set of holomorphic principal G-bundles for some group G; M is (complexified) Minkowski space \mathbb{C}^4 and the fields are principal G-bundles with connection. The PDE they satisfy is F = \star F, where F is the curvature of the bundle and \star is the Hodge dual. This means cohomology on twistor space (which classifies the bundles) is related to self-dual fields on spacetime. One can also find that a point in M corresponds to a projective line in P, while a point in P corresponds to a null plane in M. (The space K is \mathbb{C}^4 \times \mathbb{P}^1.)

Then the issue is to generalize this to higher gauge theory: rather than principal G-bundles for a group, one is talking about principal 2-bundles for a 2-group \mathcal{G} with connection. Wolf’s talk explained how there is a Penrose-Ward transform between a certain class of higher gauge theories (on the one hand) and an N=(2,0) supersymmetric field theory (on the other hand). Specifically, taking M = \mathbb{C}^6, and P to be (a subspace of) 6D projective space \mathbb{P}^7 \setminus \mathbb{P}^1, there is a similar correspondence between certain holomorphic 2-bundles on P and solutions to some self-dual field equations on M (which can be seen as constraints on the curvature 3-form F for a principal 2-bundle: the self-duality condition is why this only makes sense in 6 dimensions).

This picture generalizes to supermanifolds, where there are fermionic as well as bosonic fields. These turn out to correspond to a certain 6-dimensional N = (2,0) supersymmetric field theory.

Then Sam Palmer gave a talk in which he described a somewhat similar picture for an N = (1,0) supersymmetric theory. However, unlike the N=(2,0) theory, this one gives, not a higher gauge theory, but something that superficially looks similar, but in fact is quite different. It ends up being a theory of a number of fields – forms valued in three linked vector spaces

\mathfrak{g}^* \stackrel{g}{\rightarrow} \mathfrak{h} \stackrel{h}{\rightarrow} \mathfrak{g}

equipped with a bunch of maps that give the whole setup some structure. There is a collection of seven fields in groups (“multiplets”, in physics jargon) valued in each of these spaces. They satisfy a large number of identities. It somewhat resembles the higher gauge theory that corresponds to the N=(2,0) case, so this situation gets called a “(1,0)-gauge model”.

There are some special cases of such a setup, including Courant-Dorfman algebras and Lie 2-algebras. The talk gave quite a few examples of solutions to the equations that fall out. The overall conclusion is that, while there are some similarities between (1,0)-gauge models and the way Higher Gauge Theory appears at the level of algebra-valued forms and the equations they must satisfy, there are some significant differences. I won’t try to summarize this in more depth, because (a) I didn’t follow the nitty-gritty technical details very well, and (b) it turns out to be not HGT, but some new theory which is less well understood at summary-level.

So I spent a few weeks at the Erwin Schrödinger Institute in Vienna, doing a short residence as part of the program “Modern Trends in Topological Quantum Field Theory” leading up to a workshop this week. There were quite a few interesting talks – some on topics that I’ve written about elsewhere in this blog, so I’ll gloss over those. For example, Catherine Meusburger spoke about the project with Barrett and Schaumann to give a diagrammatic language for Gray categories with duals – I’ve written about John Barrett’s talks on this elsewhere. Similarly, I’ve written about Chris Schommer-Pries’ talks about fully-extended TQFT’s and the cobordism hypothesis for structured cobordisms. I’d like to just describe some of the other highlights that connect nicely to themes I find interesting. In Part 1 of this post, the more topological themes…

TQFTs with Boundary

On the first day, Kevin Walker gave a talk called “Premodular TQFTs” which was quite interesting. The key idea here is that a fairly big class of different constructions of 3D TQFT’s turn out to actually be aspects of one 4D TQFT, which comes about by a construction based on the 3D construction of Crane-Yetter-Kauffman.  The term “premodular” refers to the fact that 3D TQFT’s can be related to modular tensor categories. “Tensor” includes several concepts, like being abelian, having vector spaces of morphisms, a monoidal structure that gets along with these – typical examples being the categories of vector spaces, or of representations of some fixed group. “Modular” means that there is a braiding, and that a certain string diagram (which looks like two linked rings) built using the braiding can be represented as an invertible matrix. These will show up as a special case of the “premodular” theory.

The basic idea is to use an approach that is based on local fields (which respects the physics-land concept of what “field theory” means), avoids the path integral approach (which is hard to make rigorous), and can be shown to connect back to the Atiyah-Segal approach in which a TQFT is a kind of functor out of a cobordism category.

That is, given a manifold X we must be able to find the fields on X, called F(X). For example, F(X) could be the maps into a classifying space BG, for a gauge theory, or a category of diagrams on X with labels in some appropriate sort of category. Then one has some relations which say when given fields are the same. For each manifold Y, this defines a vector space of linear combinations of fields, modulo relations, called A(Y;c), where c \in F(\partial Y). The dual space of A(Y;c) is called Z(Y;c) – in keeping with the principle that quantum states are functionals that we can evaluate on “classical” fields.

Walker’s talk develops, from this starting point, a view that includes a whole range of theories – the Dijkgraaf-Witten model (fields are maps to BG); diagrams in a semisimple 1-category (“Euler characteristic theory”), in a pivotal 2-category (a Turaev-Viro model), or a premodular 3-category (a “Crane-Yetter model”), among others. In particular, some familiar theories appear as living on 3D boundaries to a 4D manifold, where such a  premodular theory is defined. The talk goes on to describe a kind of “theory with defects”, where two different theories live on different parts of a manifold (this is a common theme to a number of the talks), and in particular it describes a bimodule which gives a Morita equivalence between two sorts of theory – one based on graphs labelled in representations of a group G, and the other based on G-connections. The bimodule is, effectively, a kind of “Fourier transform” which relates dimension-k structures on one side to codimension-k structures on the other: a line labelled by a G-representation on one side gets acted upon by G-holonomies for a hypersurface on the other side.

On a related note, Alessandro Valentino gave a talk called “Boundary Conditions for 3d TQFT and module categories”. This related to a couple of papers with Jürgen Fuchs and Christoph Schweigert. The basic idea starts with the fact that one can build (3,2,1)-dimensional TQFT’s from modular tensor categories \mathcal{C}, getting a Reshetikhin-Turaev type theory which assigns \mathcal{C} to the circle. The modular tensor structure tells you what gets assigned to higher-dimensional cobordisms. (This is a higher-categorical analog of the fact that a (2,1)-dimensional TQFT is determined by a Frobenius algebra). Then the motivating question is: how can we extend this theory all the way down to a point (i.e. have it assign something to a point, so that \mathcal{C} is somehow composed of naturally occurring morphisms)?

So the question is: if we know what \mathcal{C} is, what does that tell us about the “colours” that could be assigned to a boundary? There’s a fairly elegant way to take on this question by looking at what’s assigned to Wilson lines, the observables that matter in defining RT-type theories, when the line where we’re observing gets pushed onto the boundary. (See around p14 of the first paper linked above). The colours on lines inside the manifold could be objects of \mathcal{C}, and fusing them illustrates the monoidal structure of \mathcal{C}. Then the question is what kind of category can be attached to a boundary and be consistent with this. This should be functorial with respect to fusing two lines (i.e. doing this before or after projecting to the boundary should be the same).

They don’t completely characterize the situation, but they give some reasonable arguments which suggest that the result is that the boundary category, a braided monoidal category, ought to be the Drinfel’d centre of something. This is actually a stronger constraint for categories than groups (any commutative group is the centre of something – namely itself – but this isn’t true for monoidal categories).


Joost Slingerland gave a talk called “Local Representations of the Loop Braid Group”, which was quite nice. The Loop Braid Group was introduced by the late Xiao-Song Lin (whom I had the pleasure to know at UCR) as an interesting generalization of the braid group B_n. B_n is the “motion group” of isomorphism classes of motions of n particles in a plane: in such a motion, we let the particles move around arbitrarily, before ending up occupying the same points occupied initially. (In the “pure braid group”, each individual point must end up where it started – in the braid group, they can swap places). Up to diffeomorphism, this keeps track of how they move around each other – not just how they exchange places, but which one crosses in front of which, etc. The loop braid group does the same for loops embedded in 3D space. Now, if the loops always stay far away from each other, one possibility is that a motion amounts to a permutation in which the loops switch places: two paths through 3D space (or 4D spacetime) can always be untangled. On the other hand, loops can pass THROUGH each other, as seen at the beginning of this video:

This is analogous to two points braiding in 2D space (i.e. strands twisting around each other in 3D spacetime), although in fact these “slide moves” form a group which is different from just the pure braid group – but PB_n fits inside them. In particular, the slide moves satisfy some of the same relations as the braid group – the Yang-Baxter equations.

The final thing that can happen is that loops might move, “flip over”, and return to their original position with reversed orientation. So the loop braid group can be broken down as LB_n = Slide_n \rtimes (\mathbb{Z}_2)^n \rtimes S_n. Every loop braid could be “closed up” to a 4D knotted surface, though not every knotted surface would be of this form. For one thing, our loops have a trivial embedding in 3D space here – to get every possible knotted surface, we’d need to have knots and links sliding around, braiding through each other, merging and splitting, etc. Knotted surfaces are much more complex than knotted circles, just as the topology of embedded circles is more complex than that of embedded points.

The talk described some work on the “local representations” of LB_n: representations on spaces where each loop is assigned a k-dimensional vector space V (k is the “local dimension”), so that the motions of n loops get represented on V^{\otimes n} (a tensor product of n copies of V). This is already rather complex, but is much easier than looking for arbitrary representations of LB_n on any old vector space (“nonlocal” representations, if you like). Now, in particular, for local dimension 2, this boils down to some simple matrices which can be worked out – the slide moves are either represented by some permutation matrices, or some tensor products of rotation matrices, or a few other cases which can all be classified.
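To make “local representation” a bit more concrete, here is a minimal sketch in plain Python that checks the braid (Yang-Baxter) relation for a candidate operator R on V \otimes V with local dimension 2. It only illustrates the kind of relation such matrices must satisfy; it is not the classification from the talk. The names `SWAP`, `NOT_BRAID` and `satisfies_braid_relation` are made up for the illustration, and the swap matrix is just the simplest permutation-matrix example.

```python
# Toy check of the braid (Yang-Baxter) relation for a "local" operator R on V (x) V.
# Matrices are lists of rows with integer entries, so equality tests are exact.

def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Ordinary matrix product a * b."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a[i][j] * b[p][q]
             for j in range(len(a[0])) for q in range(len(b[0]))]
            for i in range(len(a)) for p in range(len(b))]

def satisfies_braid_relation(r, k):
    """Check (R x 1)(1 x R)(R x 1) == (1 x R)(R x 1)(1 x R) on V^{(x)3}, dim V = k."""
    r12 = kron(r, identity(k))   # R acting on the first two tensor factors
    r23 = kron(identity(k), r)   # R acting on the last two tensor factors
    lhs = matmul(matmul(r12, r23), r12)
    rhs = matmul(matmul(r23, r12), r23)
    return lhs == rhs

# The swap e_a (x) e_b -> e_b (x) e_a is a permutation matrix on C^2 (x) C^2.
SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]

# A non-example: flipping only the second tensor factor.
FLIP = [[0, 1], [1, 0]]
NOT_BRAID = kron(identity(2), FLIP)

print(satisfies_braid_relation(SWAP, 2))       # True
print(satisfies_braid_relation(NOT_BRAID, 2))  # False
```

With floating-point entries (rotation matrices, say) the exact `==` comparison would need to be replaced by a tolerance check.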

Toward the end, Dror Bar-Natan also gave a talk that touched on knotted surfaces, called “A Partial Reduction of BF Theory to Combinatorics”. The mention of BF theory – a kind of higher gauge theory that can be described locally in terms of a 1-form and a 2-form on a manifold – is basically to set up some discussion of knotted surfaces (the combinatorics it reduces to). The point is that, like many field theories, BF theory amplitudes can be calculated using a sum over certain Feynman diagrams – but these ones are diagrams that lie partly in certain knotted surfaces. (See the rather remarkable handout in the link above for lots of pictures). This is sort of analogous to how some gauge theories in 3D boil down to knot invariants – for knots that live on the boundary of a region cut out of the 3-manifold. This is similar, for a knotted surface in a 4-manifold.

The “combinatorics” boils down to showing some diagram presentations of these knotted surfaces – particularly, a special type called a “ribbon knot”, which is a certain kind of knotted sphere. The combinatorics show that these special knotted surfaces all correspond to ordinary knotted circles in 3D (in the handout, you’ll see the Gauss diagram for a knot – a picture which shows which points along a line cross over or under each other in a presentation of the knot – used to construct a corresponding ribbon knot). But do check out the handout for some pictures which show several different ways of presenting 2-knots.

(…To be continued in Part 2…)


Since I moved to Hamburg, Alessandro Valentino and I have been organizing a series of seminar talks whose goal is to bring people (mostly graduate students, and some postdocs and others) up to speed on the tools used in Jacob Lurie’s big paper on the classification of TQFT and proof of the Cobordism Hypothesis.  This is part of the Forschungsseminar (“research seminar”) for the working groups of Christoph Schweigert, Ingo Runkel, and Christoph Wockel.  First, I gave one introducing myself and what I’ve done on Extended TQFT. In our main series we’ve had four talks so far – two in which Alessandro outlined a sketch of what Lurie’s result is, and another two by Sebastian Novak and Marc Palm that started catching our audience up on the simplicial methods used in the theory of (\infty,n)-categories which it uses.  Coming up in the New Year, Nathan Bowler and I will be talking about first (\infty,1)-categories, and then (\infty,n)-categories.   I’ll do a few posts summarizing the talks around then.

Some people in the group have done some work on quantum field theories with defects, in relation to which, there’s this workshop coming up here in February!  The idea here is that one could have two regions of space where different field theories apply, which are connected along a boundary. We might imagine these are theories which are different approximations to what’s going on physically, with a different approximation useful in each region.  Whatever the intuition, the regions will be labelled by some category, and boundaries between regions are labelled by functors between categories.  Where different boundary walls meet, one can have natural transformations.  There’s a whole theory of how a 3D TQFT can be associated to modular tensor categories, in sort of the same sense that a 2D TQFT is associated to a Frobenius algebra. This whole program is intimately connected with the idea of “extending” a given TQFT, in the sense that it deals with theories that have inputs which are spaces (or, in the case of defects, sub-spaces of given ones) of many different dimensions.  Lurie’s paper describing the n-dimensional cobordism category is very much related to the input to a theory like this.

Brno Visit

This time, I’d like to mention something which I began working on with Roger Picken in Lisbon, and talked about for the first time in Brno, Czech Republic, where I was invited to visit at Masaryk University.  I was in Brno for a week or so, and on Thursday, December 13, I gave this talk, called “Higher Gauge Theory and 2-Group Actions”.  But first, some pictures!

This fellow was near the hotel I stayed in:


Since this sculpture is both faceless and hard at work on nonspecific manual labour, I assume he’s a Communist-era artwork, but I don’t really know for sure.

The Christmas market was on in Náměstí Svobody (Freedom Square) in the centre of town.  This four-headed dragon caught my eye:


On the way back from Brno to Hamburg, I met up with my wife to spend a couple of days in Prague.  Here’s the Christmas market in the Old Town Square of Prague:


Anyway, it was a good visit to the Czech Republic.  Now, about the talk!

Moduli Spaces in Higher Gauge Theory

The motivation which I tried to emphasize is to define a specific, concrete situation in which to explore the concept of “2-Symmetry”.  The situation is supposed to be, if not a realistic physical theory, then at least one which has enough physics-like features to give a good proof of concept argument that such higher symmetries should be meaningful in nature.  The idea is that Higher Gauge theory is a field theory which can be understood as one in which the possible (classical) fields on a space/spacetime manifold consist of maps from that space into some target space X.  For the topological theory, they are actually just homotopy classes of maps.  This is somewhat related to Sigma models used in theoretical physics, and mathematically to Homotopy Quantum Field Theory, which considers these maps as geometric structure on a manifold.  An HQFT is a functor taking such structured manifolds and cobordisms into Hilbert spaces and linear maps.  In the paper Roger and I are working on, we don’t talk about this stage of the process: we’re just considering how higher-symmetry appears in the moduli spaces for fields of this kind, which we think of in terms of Higher Gauge Theory.

Ordinary topological gauge theory – the study of flat connections on G-bundles for some Lie group G – can be looked at this way.  The target space X = BG is the “classifying space” of the Lie group – homotopy classes of maps in Hom(M,BG) are the same as groupoid homomorphisms in Hom(\Pi_1(M),G).  Specifically, the pair of functors \Pi_1 and B relating groupoids and topological spaces are adjoints.  Now, this deals with the situation where X = BG is a homotopy 1-type, which is to say that it has a fundamental groupoid \Pi_1(X) = G, and no other interesting homotopy groups.  To deal with more general target spaces X, one should really deal with infinity-groupoids, which can capture the whole homotopy type of X – in particular, all its higher homotopy groups at once (and various relations between them).  What we’re talking about in this paper is exactly one step in that direction: we deal with 2-groupoids.
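As a toy illustration of Hom(\Pi_1(M),G), here is a sketch for a finite group rather than a Lie group. It assumes M is the torus with a chosen basepoint, so that \pi_1(M) = \mathbb{Z}^2 and a homomorphism into G is the same as a pair of commuting elements (the two holonomies). Gauge transformations at the basepoint act by simultaneous conjugation. The choice G = S_3 and all function names are assumptions made just for the example.

```python
from itertools import permutations

# Elements of S_3 as tuples p, where p[i] is the image of i.
GROUP = list(permutations(range(3)))

def compose(p, q):
    """Composite permutation: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def flat_connections_on_torus(group):
    """Homomorphisms Z^2 -> G, i.e. pairs of commuting holonomies (a, b)."""
    return [(a, b) for a in group for b in group
            if compose(a, b) == compose(b, a)]

def count_gauge_orbits(pairs, group):
    """Count orbits of simultaneous conjugation (a, b) -> (g a g^-1, g b g^-1)."""
    seen = set()
    orbits = 0
    for pair in pairs:
        if pair in seen:
            continue
        orbits += 1
        for g in group:
            g_inv = inverse(g)
            seen.add(tuple(compose(compose(g, x), g_inv) for x in pair))
    return orbits

pairs = flat_connections_on_torus(GROUP)
print(len(pairs))                        # 18 commuting pairs in S_3
print(count_gauge_orbits(pairs, GROUP))  # 8 gauge-equivalence classes
```

The objects of the groupoid of flat connections are the 18 pairs, and the isomorphism classes are the 8 orbits; the morphisms are the conjugating group elements.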

We can think of this in terms of maps into a target space X which is a 2-type, with nontrivial fundamental groupoid \Pi_1(X), but also interesting second homotopy group \pi_2(X) (and nothing higher).  These fit together to make a 2-groupoid \Pi_2(X), which is a 2-group if X is connected.  The idea is that X is the classifying space of some 2-group \mathcal{G}, which plays the role of the Lie group G in gauge theory.  It is the “gauge 2-group”.  Homotopy classes of maps into X = B \mathcal{G} correspond to flat connections in this 2-group.

For practical purposes, we use the fact that there are several equivalent ways of describing 2-groups.  Two very directly equivalent ways to define them are as group objects internal to \mathbf{Cat}, or as categories internal to \mathbf{Grp} – which have a group of objects and a group of morphisms, and group homomorphisms that define source, target, composition, and so on.  This second way is fairly close to the equivalent formulation as crossed modules (G,H,\rhd,\partial).  The definition is in the slides, but essentially the point is that G is the group of objects, and with the action G \rhd H, one gets the semidirect product G \ltimes H which is the group of morphisms.  The map \partial : H \rightarrow G makes it possible to speak of G and H acting on each other, and to say that these actions “look like conjugation” (the precise meaning of which is in the defining properties of the crossed module).
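The two “looks like conjugation” conditions can be checked by brute force in a small example. The sketch below assumes the standard crossed-module axioms (equivariance of \partial and the Peiffer identity) and uses the toy crossed module where H = A_3 sits inside G = S_3, \partial is the inclusion, and G acts by conjugation. This example, and the convention that a morphism (g,h) goes from g to \partial(h) g, are my assumptions rather than anything taken from the slides.

```python
from itertools import permutations, product

def compose(p, q):
    """Permutation product p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    """Sign of a permutation, computed from its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))        # S_3
H = [p for p in G if sign(p) == 1]      # A_3, a normal subgroup of S_3

def boundary(h):
    """The map H -> G; here simply the inclusion."""
    return h

def act(g, h):
    """The action of G on H; here conjugation."""
    return compose(compose(g, h), inverse(g))

def is_crossed_module(G, H, boundary, act):
    closed = all(act(g, h) in H for g, h in product(G, H))
    # Equivariance: boundary(g ▷ h) = g boundary(h) g^{-1}
    equivariant = all(
        boundary(act(g, h)) == compose(compose(g, boundary(h)), inverse(g))
        for g, h in product(G, H)
    )
    # Peiffer identity: boundary(h) ▷ k = h k h^{-1}
    peiffer = all(
        act(boundary(h), k) == compose(compose(h, k), inverse(h))
        for h, k in product(H, H)
    )
    return closed and equivariant and peiffer

# Morphisms of the 2-group: pairs (g, h), read as arrows g -> boundary(h) g.
morphisms = list(product(G, H))

print(is_crossed_module(G, H, boundary, act), len(morphisms))
```

The pairs in `morphisms` are the elements of G \ltimes H, so there are |G| |H| of them.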

The reason for looking at the crossed-module formulation is that it then becomes fairly easy to understand the geometric nature of the fields we’re talking about.  In ordinary gauge theory, a connection can be described locally as a 1-form with values in Lie(G), the Lie algebra of G.  Integrating such forms along curves gives another way to describe the connection, in terms of a rule assigning to every curve a holonomy valued in G which describes how to transport something (generally, a fibre of a bundle) along the curve.  It’s somewhat nontrivial to say how this relates to the classic definition of a connection on a bundle, which can be described locally on “patches” of the manifold via 1-forms together with gluing functions where patches overlap.  The resulting categories are equivalent, though.

In higher gauge theory, we take a similar view. There is a local view of “connections on gerbes”, described by forms and gluing functions (the main difference in higher gauge theory is that the gluing functions are related to higher cohomology).  But we will take the equivalent point of view where the connection is described by G-valued holonomies along paths, and H-valued holonomies over surfaces, for a crossed module (G,H,\rhd,\partial), which satisfy some flatness conditions.  These amount to 2-functors of 2-categories \Pi_2(M) \rightarrow \mathcal{G}.

The moduli space of all such 2-connections is only part of the story.  2-functors are related by natural transformations, which are in turn related by “modifications”.  In gauge theory, the natural transformations are called “gauge transformations”, and though the term doesn’t seem to be in common use, the obvious term for the next layer would be “gauge modifications”. It is possible to assemble a 2-groupoid Hom(\Pi_2(M),\mathcal{G}), whose space of objects is exactly the moduli space of 2-connections, and whose 1- and 2-morphisms are exactly these gauge transformations and modifications.  So the question is, what is the meaning of the extra information contained in the 2-groupoid which doesn’t appear in the moduli space itself?

Our claim is that this information expresses how the moduli space carries “higher symmetry”.

2-Group Actions and the Transformation Double Category

What would it mean to say that something exhibits “higher” symmetry? A rudimentary way to formalize the intuition of “symmetry” is to say that there is a group (of “symmetries”) which acts on some object. One could get more subtle, but this should be enough to begin with. We already noted that “higher” gauge theory uses 2-groups (and beyond into n-groups) in the place of ordinary groups.  So in this context, the natural way to interpret it is by saying that there is an action of a 2-group on something.

Just as there are several equivalent ways to define a 2-group, there are different ways to say what it means for it to have an action on something.  One definition of a 2-group is to say that it’s a 2-category with one object and all morphisms and 2-morphisms invertible.  This definition makes it clear that a 2-group has to act on an object of some 2-category \mathcal{C}. For our purposes, just as we normally think of group actions on sets, we will focus on 2-group actions on categories, so that \mathcal{C} = \mathbf{Cat} is the 2-category of interest. Then an action is just a map:

\Phi : \mathcal{G} \rightarrow \mathbf{Cat}

The unique object of \mathcal{G} – let’s call it \star – gets taken to some object \mathbf{C} = \Phi(\star) \in \mathbf{Cat}.  This object \mathbf{C} is the thing being “acted on” by \mathcal{G}.  The existence of the action implies that there are automorphisms \Phi(g) : \mathbf{C} \rightarrow \mathbf{C} for every morphism in \mathcal{G} (which correspond to the elements of the group G of the crossed module).  This would be enough to describe ordinary symmetry, but the higher symmetry is also expressed in the images of 2-morphisms \Phi( \eta : g \rightarrow g') = \Phi(\eta) : \Phi(g) \rightarrow \Phi(g'), which we might call 2-symmetries relating 1-symmetries.

What we want to do in our paper, which the talk summarizes, is to show how this sort of 2-group action gives rise to a 2-groupoid (actually, just a 2-category when the \mathbf{C} being acted on is a general category).  Then we claim that the 2-groupoid of connections can be seen as one that shows up in exactly this way.  (In the following, I have to give some credit to Dany Majard for talking this out and helping to find a better formalism.)

To make sense of this, we use the fact that there is a diagrammatic way to describe the transformation groupoid associated to the action of a group G on a set S.  The set of morphisms is built as a pullback of the action map, \rhd : (g,s) \mapsto g(s).


This means that morphisms are pairs (g,s), thought of as going from s to g(s).  The rule for composing these is another pullback.  The diagram which shows how it’s done appears in the slides.  The whole construction ends up giving a cubical diagram in \mathbf{Sets}, whose top and bottom faces are mere commuting diagrams, and whose four other faces are all pullback squares.
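For a finite example, the transformation groupoid can be written out directly. This is a minimal sketch assuming G = S_3 acting on S = {0, 1, 2} by permuting points (my choice of example); the function names are hypothetical. The list of composable pairs plays the role of the second pullback mentioned above.

```python
from itertools import permutations, product

def compose(p, q):
    """Permutation product p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))   # toy group: S_3
S = [0, 1, 2]                      # the set being acted on

def act(g, s):
    """The action map (g, s) -> g(s)."""
    return g[s]

# Morphisms of S//G are pairs (g, s), thought of as arrows s -> g(s).
morphisms = list(product(G, S))

def source(m):
    return m[1]

def target(m):
    g, s = m
    return act(g, s)

def compose_morphisms(m2, m1):
    """Composite 'm2 after m1', defined only when source(m2) == target(m1)."""
    (g2, _), (g1, s) = m2, m1
    if source(m2) != target(m1):
        raise ValueError("morphisms are not composable")
    return (compose(g2, g1), s)

# Composable pairs: the pullback of source and target.
composable = [(m2, m1) for m2, m1 in product(morphisms, morphisms)
              if source(m2) == target(m1)]

print(len(morphisms), "morphisms,", len(composable), "composable pairs")
```

Each composite starts where the first arrow starts and ends where the second one ends, which is exactly the statement that \rhd is an action.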

To construct a 2-category from a 2-group action is similar. For now we assume that the 2-group action is strict (rather than being given by \Phi a weak 2-functor).  In this case, it’s enough to think of our 2-group \mathcal{G} not as a 2-category, but as a group-object in \mathbf{Cat} – the same way that a 1-group, as well as being a category, can be seen as a group object in \mathbf{Set}.  The set of objects of this category is the group G of morphisms of the 2-category, and the morphisms make up the group G \ltimes H of 2-morphisms.  Being a group object is the same as having all the extra structure making up a 2-group.

To describe a strict action of such a \mathcal{G} on \mathbf{C}, we just reproduce in \mathbf{Cat} the diagram that defines an action in \mathbf{Sets}:


The fact that \rhd is an action just means this commutes. In principle, we could define a weak action, which would mean that this commutes up to isomorphism, but we won’t be looking at that here.

Constructing the same diagram which describes the structure of a transformation groupoid (p29 in the slides for the talk), we get a structure with a “category of objects” and a “category of morphisms”.  The construction in \mathbf{Set} gives us directly a set of morphisms, while S itself is the set of objects. Similarly, in \mathbf{Cat}, the category of objects is just \mathbf{C}, while the construction gives a category of morphisms.

The two together make a category internal to \mathbf{Cat}, which is to say a double category.  By analogy with S / \!\! / G, we call this double category \mathbf{C} / \!\! / \mathcal{G}.

We take \mathbf{C} as the category of objects, as the “horizontal category”, whose morphisms are the horizontal arrows of the double category. The category of morphisms of \mathbf{C} /\!\!/ \mathcal{G} shows up by letting its objects be the vertical arrows of the double category, and its morphisms be the squares.  These look like this:


The vertical arrows are given by pairs of objects (\gamma, x), and just like the transformation 1-groupoid, each corresponds to the fact that the action of \gamma takes x to \gamma \rhd x. Each square (morphism in the category of morphisms) is given by a pair ( (\gamma, \eta), f) of morphisms, one from \mathcal{G} (given by an element in G \ltimes H), and one from \mathbf{C}.

The horizontal arrow on the bottom of this square is:

((\partial \eta) \gamma \rhd f) \circ \Phi(\gamma,\eta)_x = \Phi(\gamma,\eta)_y \circ (\gamma \rhd f)

The fact that these are equal is exactly the fact that \Phi(\gamma,\eta) is a natural transformation.
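The naturality condition can be tested exhaustively in a toy case. The sketch below is not the example from the talk: it assumes the crossed module with G = H = S_3, \partial the identity and G acting by conjugation, and lets the resulting 2-group act on its own underlying category by left multiplication. The component formula \Phi(\gamma,\eta)_x = (\gamma x, \eta) and all function names are my assumptions for this illustration.

```python
from itertools import permutations, product

def compose(p, q):
    """Permutation product p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))   # objects of the category C
H = G                              # toy crossed module with H = G

def boundary(h):
    return h                       # identity map H -> G

def act(g, h):
    return compose(compose(g, h), inverse(g))   # conjugation

def target(f):
    """A morphism f = (x, h) of C goes from x to boundary(h) x."""
    x, h = f
    return compose(boundary(h), x)

def comp(f2, f1):
    """Composite 'f2 after f1' in C."""
    (x2, h2), (x1, h1) = f2, f1
    if x2 != target(f1):
        raise ValueError("morphisms are not composable")
    return (x1, compose(h2, h1))

def act_on_morphism(gamma, f):
    """gamma ▷ f: left multiplication on objects, the action ▷ on H-labels."""
    x, h = f
    return (compose(gamma, x), act(gamma, h))

def component(gamma, eta, x):
    """Assumed component Phi(gamma, eta)_x : gamma x -> boundary(eta) gamma x."""
    return (compose(gamma, x), eta)

def naturality_holds(gamma, eta, f):
    x, _ = f
    y = target(f)
    gamma2 = compose(boundary(eta), gamma)   # target of the 2-morphism
    lhs = comp(act_on_morphism(gamma2, f), component(gamma, eta, x))
    rhs = comp(component(gamma, eta, y), act_on_morphism(gamma, f))
    return lhs == rhs

all_ok = all(naturality_holds(gamma, eta, (x, h))
             for gamma, eta, x, h in product(G, G, G, H))
print(all_ok)
```

In this toy case the two sides agree because of the Peiffer identity, which is one way to see why the crossed-module axioms are what make the squares of the double category well defined.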

The double category \mathbf{C} /\!\!/ \mathcal{G} turns out to have a very natural example which occurs in higher gauge theory.

Higher Symmetry of the Moduli Space

The point of the talk is to show how the 2-groupoid of connections, previously described as Hom(\Pi_2(M),\mathcal{G}), can be seen as coming from a 2-group action on a category – the objects of this category being exactly the connections. In the slides above, for various reasons, we did this in a discretized setting – a manifold with a decomposition into cells. This is useful for writing things down explicitly, but not essential to the idea behind the 2-symmetry of the moduli space.

The point is that there is a category we call \mathbf{Conn}, whose objects are the connections: these assign G-holonomies to edges of our discretization (in general, to paths), and H-holonomies to 2D faces. (Without discretization, one would describe these in terms of Lie(G)-valued 1-forms and Lie(H)-valued 2-forms.)

The morphisms of \mathbf{Conn} are one type of “gauge transformation”: namely, those which assign H-holonomies to edges. (Or: Lie(H)-valued 1-forms). They affect the edge holonomies of a connection just like a 2-morphism in \mathcal{G}.  Face holonomies are affected by the H-value that comes from the boundary of the face.

What’s physically significant here is that both objects and morphisms of \mathbf{Conn} describe nonlocal geometric information.  They describe holonomies over edges and surfaces: not what happens at a point.  The “2-group of gauge transformations”, which we call \mathbf{Gauge}, on the other hand, is purely about local transformations.  If V is the vertex set of the discretized manifold, then \mathbf{Gauge} = \mathcal{G}^V: one copy of the gauge 2-group at each vertex.  (Keeping this finite dimensional and avoiding technical details was one main reason we chose to use a discretization.  In principle, one could also talk about the 2-group of \mathcal{G}-valued functions, whose objects and morphisms, thinking of it as a group object in \mathbf{Cat}, are functions valued in morphisms of \mathcal{G}.)

Now, the way \mathbf{Gauge} acts on \mathbf{Conn} is essentially by conjugation: edge holonomies are affected by pre- and post-multiplication by the values at the two vertices on the edge – whether objects or morphisms of \mathbf{Gauge}.  (Face holonomies are unaffected).  There are details about this in the slides, but the important thing is that this is a 2-group of purely local changes.  The objects of \mathbf{Gauge} are gauge transformations of this other type.  In a continuous setting, they would be described by G-valued functions.  The morphisms are gauge modifications, and could be described by H-valued functions.
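Here is a minimal sketch of the object-level part of this action, showing only the G-valued data. It assumes a triangle as the discretized manifold, S_3 as a toy gauge group, and the convention that an edge holonomy g_e from s to t becomes \phi(t) g_e \phi(s)^{-1}; the specific group elements and names are hypothetical. The H-valued parts (face holonomies and gauge modifications) are left out.

```python
def compose(p, q):
    """Permutation product p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

IDENTITY = (0, 1, 2)

# A connection: one G-holonomy per oriented edge (source, target) of a triangle.
connection = {(0, 1): (1, 0, 2), (1, 2): (0, 2, 1), (2, 0): (1, 2, 0)}

# An object of Gauge = G^V: one group element at each vertex.
gauge = {0: (2, 0, 1), 1: (1, 0, 2), 2: (0, 2, 1)}

def transform(connection, gauge):
    """Act on edge holonomies: g_e -> phi(target) g_e phi(source)^{-1}."""
    return {
        (s, t): compose(compose(gauge[t], g), inverse(gauge[s]))
        for (s, t), g in connection.items()
    }

def loop_holonomy(connection, loop):
    """Multiply edge holonomies along a path, with the first edge applied first."""
    hol = IDENTITY
    for edge in loop:
        hol = compose(connection[edge], hol)
    return hol

loop = [(0, 1), (1, 2), (2, 0)]           # closed loop based at vertex 0
before = loop_holonomy(connection, loop)
after = loop_holonomy(transform(connection, gauge), loop)
expected = compose(compose(gauge[0], before), inverse(gauge[0]))
print(after == expected)
```

The change at each vertex cancels along the loop except at the basepoint, so the holonomy around a closed loop is only conjugated. That is the sense in which these transformations are purely local, while the holonomies themselves carry the nonlocal information.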

The main conceptual point here is that we have really distinguished between two kinds of gauge transformation, which are the horizontal and vertical arrows of the double category \mathbf{Conn} /\!\!/ \mathbf{Gauge}.  This expresses the 2-symmetry by moving some gauge transformations into the category of connections, and others into the 2-group which acts on it.  But physically, we would like to say that both are “gauge transformations”.  So one way to do this is to “collapse” the double category to a bicategory: just formally allow horizontal and vertical arrows to compose, so that there is only one kind of arrow.  Squares become 2-cells.

So then if we collapse the double category expressing our 2-symmetry relation this way, the result is exactly equivalent to the functor category way of describing connections.  (The morphisms will all be invertible because \mathbf{Conn} is a groupoid and \mathbf{Gauge} is a 2-group).

I’m interested in this kind of geometrical example partly because it gives a good way to visualize something new happening here.  There appears to be some natural 2-symmetry on this space of fields, which is fairly easy to see geometrically, and distinguishes in a fundamental way between two types of gauge transformation.  This sort of phenomenon doesn’t occur in the world of \mathbf{Sets} – a set S has no morphisms, after all, so the transformation groupoid for a group action on it is much simpler.

In broad terms, this means that 2-symmetry has qualitatively new features that familiar old 1-symmetry doesn’t have.  Higher categorical versions – n-groups acting on n-groupoids, as might show up in more complicated HQFT – will certainly be even more complicated.  The 2-categorical version is just the first non-trivial situation where this happens, so it gives a nice starting point to understand what’s new in higher symmetry that we didn’t already know.

This entry is a by-special-request blog, which Derek Wise invited me to write for the blog associated with the International Loop Quantum Gravity Seminar, and it will appear over there as well.  The ILQGS is a long-running regular seminar which runs as a teleconference, with people joining in from various countries, on various topics which are more or less closely related to Loop Quantum Gravity and the interests of people who work on it.  The custom is that when someone gives a talk, someone else writes up a description of the talk for the ILQGS blog, and Derek invited me to write up a description of his talk.  The audio file of the talk itself is available in .aiff and .wav formats, and the slides are here.

The talk that Derek gave was based on a project of his and Steffen Gielen’s, which has taken written form in a few papers (two shorter ones, “Spontaneously broken Lorentz symmetry for Hamiltonian gravity”, “Linking Covariant and Canonical General Relativity via Local Observers”, and a new, longer one called “Lifting General Relativity to Observer Space”).

The key idea behind this project is the notion of “observer space”, which is exactly what it sounds like: a space of all observers in a given universe.  This is easiest to picture when one has a spacetime – a manifold with a Lorentzian metric, (M,g) – to begin with.  Then an observer can be specified by choosing a particular point (x_0,x_1,x_2,x_3) = \mathbf{x} in spacetime, as well as a unit future-directed timelike vector v.  This vector is a tangent to the observer’s worldline at \mathbf{x}.  The observer space is therefore a bundle over M, the “future unit tangent bundle”.  However, using the notion of a “Cartan geometry”, one can give a general definition of observer space which makes sense even when there is no underlying (M,g).

The result is a surprising, relatively new physical intuition: “spacetime” is a local and observer-dependent notion, which in some special cases can be extended so that all observers see the same spacetime.  This is somewhat related to the relativity of locality, which I’ve blogged about previously.  Geometrically, it is similar to the fact that a slicing of spacetime into space and time is not unique, and not respected by the full symmetries of the theory of Relativity, even for flat spacetime (much less for the case of General Relativity).  Similarly, we will see a notion of “observer space”, which can sometimes be turned into a bundle over an objective spacetime M, but not in all cases.

So, how is this described mathematically?  In particular, what did I mean up there by saying that spacetime becomes observer-dependent?

Cartan Geometry

The answer uses Cartan geometry, which is a framework for differential geometry that is slightly broader than what is commonly used in physics.  Roughly, one can say “Cartan geometry is to Klein geometry as Riemannian geometry is to Euclidean geometry”.  The more familiar direction of generalization here is the fact that, like Riemannian geometry, Cartan is concerned with manifolds which have local models in terms of simple, “flat” geometries, but which have curvature, and fail to be homogeneous.  First let’s remember how Klein geometry works.

Klein’s Erlangen Program, announced in 1872, systematically brought abstract algebra, and specifically the theory of Lie groups, into geometry, by placing the idea of symmetry in the leading role.  It describes “homogeneous spaces”, which are geometries in which every point is indistinguishable from every other point.  This is expressed by the existence of a transitive action of some Lie group G of all symmetries on an underlying space.  Any given point x will be fixed by some symmetries, and not others, so one also has a subgroup H = Stab(x) \subset G.  This is the “stabilizer subgroup”, consisting of all symmetries which fix x.  That the space is homogeneous means that for any two points x,y, the subgroups Stab(x) and Stab(y) are conjugate (by a symmetry taking x to y).  Then the homogeneous space, or Klein geometry, associated to (G,H) is, up to isomorphism, just the same as the quotient space G/H of the obvious action of H on G.
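The identification of a homogeneous space with G/H can be seen in a finite analog. This sketch assumes S_3 acting on three points as a stand-in for a Lie group acting on a manifold (my choice, purely for illustration) and checks that the cosets of the stabilizer correspond one-to-one with the points.

```python
from itertools import permutations

def compose(p, q):
    """Permutation product p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))   # toy symmetry group acting on X
X = [0, 1, 2]
x0 = 0                             # a chosen base point

# Stabilizer subgroup H = Stab(x0): the symmetries that fix x0.
H = [g for g in G if g[x0] == x0]

# Cosets gH: the quotient of G by right multiplication by H.
cosets = {frozenset(compose(g, h) for h in H) for g in G}

# Every element of a coset sends x0 to the same point, so gH -> g(x0)
# is a well-defined map from G/H to X.
point_of = {coset: {g[x0] for g in coset} for coset in cosets}

print(len(H), len(cosets), sorted(min(points) for points in point_of.values()))
```

Because the action is transitive, the map from cosets to points is a bijection, which is the finite version of the statement that the Klein geometry is G/H.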

The advantage of this program is that it has a great many examples, but the most relevant ones for now are:

  • n-dimensional Euclidean space.  The Euclidean group ISO(n) = SO(n) \ltimes \mathbb{R}^n is precisely the group of transformations that leave the data of Euclidean geometry, lengths and angles, invariant.  It acts transitively on \mathbb{R}^n.  Any point will be fixed by the group of rotations centred at that point, which is a subgroup of ISO(n) isomorphic to SO(n).  Klein’s insight is to reverse this: we may define Euclidean space by \mathbb{R}^n \cong ISO(n)/SO(n).
  • n-dimensional Minkowski space.  Similarly, we can define this space to be ISO(n-1,1)/SO(n-1,1).  The Euclidean group has been replaced by the Poincaré group, and rotations by the Lorentz group (of rotations and boosts), but otherwise the situation is essentially the same.
  • de Sitter space.  As a Klein geometry, this is the quotient SO(4,1)/SO(3,1).  That is, the stabilizer of any point is the Lorentz group – so things look locally rather similar to Minkowski space around any given point.  But the global symmetries of de Sitter space are different.  Even more, it looks like Minkowski space locally in the sense that the Lie algebra quotients so(4,1)/so(3,1) and iso(3,1)/so(3,1) are identical, seen as representations of SO(3,1).  It’s natural to identify them with the tangent space at a point.  de Sitter space as a whole is easiest to visualize as a 4D hyperboloid in \mathbb{R}^5.  This is supposed to be seen as a local model of spacetime in a theory in which there is a cosmological constant that gives empty space a constant positive curvature.
  • anti-de Sitter space. This is similar, but now the quotient is SO(3,2)/SO(3,1) – in fact, this whole theory goes through for any of the last three examples: Minkowski; de Sitter; and anti-de Sitter, each of which acts as a “local model” for spacetime in General Relativity with the cosmological constant, respectively: zero; positive; and negative.

Now, what does it mean to say that a Cartan geometry has a local model?  Well, just as a Lorentzian or Riemannian manifold is “locally modelled” by Minkowski or Euclidean space, a Cartan geometry is locally modelled by some Klein geometry.  This is best described in terms of a connection on a principal G-bundle, and the associated G/H-bundle, over some manifold M.  The crucial bundle in a Riemannian or Lorentzian geometry is the frame bundle: the fibre over each point consists of all the ways to isometrically embed a standard Euclidean or Minkowski space into the tangent space.  A connection on this bundle specifies how this embedding should transform as one moves along a path.  It’s determined by a 1-form on M, valued in the Lie algebra of G.

Given a parametrized path, one can apply this form to the tangent vector at each point, and get a Lie algebra-valued answer.  Integrating along the path, we get a path in the Lie group G (which is independent of the parametrization).  This is called a “development” of the path, and by applying the G-values to the model space G/H, we see that the connection tells us how to move through a copy of G/H as we move along the path.  The image this suggests is of “rolling without slipping” – think of the case where the model space is a sphere.  The connection describes how the model space “rolls” over the surface of the manifold M.  Curvature of the connection measures the failure to commute of the processes of rolling in two different directions.  A connection with zero curvature describes a space which (locally at least) looks exactly like the model space: picture a sphere rolling against its mirror image.  Transporting the sphere-shaped fibre around any closed curve always brings it back to its starting position. Now, curvature is defined in terms of transports of these Klein-geometry fibres.  If curvature is measured by the development of curves, we can think of each homogeneous space as a flat Cartan geometry with itself as a local model.
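The “rolling without slipping” picture can be tried numerically. The following is a toy sketch, not a general Cartan connection: it assumes a unit sphere rolling on a flat plane, tracks only the accumulated rotation in SO(3), and fixes one sign convention for which way the sphere turns. The step encoding and function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def roll(steps):
    """Accumulated rotation of a unit sphere rolled along straight segments.

    Each step is (axis, distance) with axis 'x' or 'y'. Assumed convention:
    rolling a distance d along x rotates the sphere by d about the y-axis,
    and rolling d along y rotates it by -d about the x-axis.
    """
    total = IDENTITY
    for axis, d in steps:
        step = rot_y(d) if axis == 'x' else rot_x(-d)
        total = matmul(step, total)   # later steps compose on the left
    return total

def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

d = math.pi / 2
square_loop = roll([('x', d), ('y', d), ('x', -d), ('y', -d)])
out_and_back = roll([('x', d), ('x', -d)])

# A trace of 3 means the identity rotation; anything smaller means the
# sphere came back to its starting point in a different orientation.
print(round(trace(out_and_back), 6), round(trace(square_loop), 6))
```

Retracing a path undoes the rotation, but going around a square on the plane does not. The sphere is curved and the plane is flat, so the failure of rolling in two directions to commute shows up as a nontrivial rotation around the loop.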

This idea, that the curvature of a manifold depends on the model geometry being used to measure it, shows up in the way we apply this geometry to physics.

Gravity and Cartan Geometry

MacDowell-Mansouri gravity can be understood as a theory in which General Relativity is modelled by a Cartan geometry.  Of course, a standard way of presenting GR is in terms of the geometry of a Lorentzian manifold.  In the Palatini formalism, the basic fields are a connection \omega and a vierbein (coframe field) called e, with dynamics encoded in the Palatini action, which is the integral over M of R[\omega] \wedge e \wedge e, where R is the curvature 2-form for \omega.

This can be derived from a Cartan geometry, whose model geometry is de Sitter space SO(4,1)/SO(3,1).  Then MacDowell-Mansouri gravity gets \omega and e by splitting the Lie algebra as so(4,1) = so(3,1) \oplus \mathbb{R}^4.  This “breaks the full symmetry” at each point.  Then one has a fairly natural action on the so(4,1)-connection:

\int_M tr(F_h \wedge \star F_h)

Here, F_h is the so(3,1) part of the curvature of the big connection.  The splitting of the connection means that F_h = R + e \wedge e, and the action above is rewritten, up to a normalization, as the Palatini action for General Relativity (plus a topological term, which has no effect on the equations of motion we get from the action).  So General Relativity can be written as the theory of a Cartan geometry modelled on de Sitter space.
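One way to see where the e \wedge e term comes from is to check that the bracket of two “translation” generators of so(4,1) lands in the so(3,1) part. The sketch below assumes a particular basis and normalization of generators, (M_{ab})^i_j = \delta^i_a \eta_{bj} - \delta^i_b \eta_{aj} with \eta = diag(-1,1,1,1,1), which is my choice and not taken from the papers.

```python
N = 5
ETA = [-1, 1, 1, 1, 1]   # diagonal metric; index 0 is time, index 4 the extra direction

def generator(a, b):
    """Matrix of M_ab, with entries delta^i_a eta_bj - delta^i_b eta_aj."""
    m = [[0] * N for _ in range(N)]
    m[a][b] += ETA[b]
    m[b][a] -= ETA[a]
    return m

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(N)] for i in range(N)]

def negate(m):
    return [[-entry for entry in row] for row in m]

def in_so31(m):
    """True when the matrix only mixes indices 0..3 (last row and column vanish)."""
    return all(m[4][j] == 0 and m[j][4] == 0 for j in range(N))

# The R^4 part of so(4,1) = so(3,1) + R^4: generators mixing index 4 with 0..3.
translations = [generator(a, 4) for a in range(4)]

brackets_in_so31 = all(
    in_so31(commutator(translations[a], translations[b]))
    for a in range(4) for b in range(4)
)
print(brackets_in_so31)
```

In this basis the bracket of two translations is minus the corresponding Lorentz generator, so two copies of the \mathbb{R}^4-valued part of the connection contribute to the so(3,1) part of the curvature. For the Poincaré algebra iso(3,1) the translations commute instead, and no such term appears.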

The cosmological constant in GR shows up because a “flat” connection for a Cartan geometry based on de Sitter space will look (if measured by Minkowski space) as if it has constant curvature which is exactly that of the model Klein geometry.  The way to think of this is to take the fibre bundle of homogeneous model spaces as a replacement for the tangent bundle to the manifold.  The fibre at each point describes the local appearance of spacetime.  If empty spacetime is flat, this local model is Minkowski space, ISO(3,1)/SO(3,1), and one can really speak of tangent “vectors”.  In the de Sitter and anti-de Sitter cases, the tangent homogeneous space is not linear: the fibres are not vector spaces, precisely because the large group of symmetries doesn’t contain a group of translations, but they are Klein geometries constructed in just the same way as Minkowski space. Thus, the local description of the connection in terms of Lie(G)-valued forms can be treated in the same way, regardless of which Klein geometry G/H occurs in the fibres.  In particular, General Relativity, formulated in terms of Cartan geometry, always says that, in the absence of matter, the geometry of space is flat, and the cosmological constant is included naturally by the choice of which Klein geometry is the local model of spacetime.

Observer Space

The idea in defining an observer space is to combine two symmetry reductions into one.  The reduction from SO(4,1) to SO(3,1) gives de Sitter space, SO(4,1)/SO(3,1), as a model Klein geometry, which reflects the “symmetry breaking” that happens when choosing one particular point in spacetime, or event.  Then, the reduction of SO(3,1) to SO(3) similarly reflects the symmetry breaking that occurs when one chooses a specific time direction (a future-directed unit timelike vector).  These are the tangent vectors to the worldline of an observer at the chosen point, so SO(3,1)/SO(3), the model Klein geometry, is the space of such possible observers.  The stabilizer subgroup for a point in this space consists of just the rotations of space around the corresponding observer – the boosts in SO(3,1) translate between observers.  So locally, choosing an observer amounts to a splitting of the model spacetime at the point into a product of space and time. If we combine both reductions at once, we get the 7-dimensional Klein geometry SO(4,1)/SO(3).  This is just the future unit tangent bundle of de Sitter space, which we think of as a homogeneous model for the “space of observers”.
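The dimension count behind the “7-dimensional” claim is quick to check. This small sketch uses the standard facts that SO(p,q) with p + q = n has dimension n(n-1)/2 and that a Klein geometry G/H has dimension dim G - dim H; the helper names are hypothetical.

```python
def dim_so(n):
    """Dimension of SO(p, q) for any signature with p + q = n."""
    return n * (n - 1) // 2

def dim_klein(n_big, n_small):
    """Dimension of the quotient SO(n_big) / SO(n_small), in any signature."""
    return dim_so(n_big) - dim_so(n_small)

dims = {
    "de Sitter space SO(4,1)/SO(3,1)": dim_klein(5, 4),
    "observers at one event SO(3,1)/SO(3)": dim_klein(4, 3),
    "observer space SO(4,1)/SO(3)": dim_klein(5, 3),
}

for name, dim in dims.items():
    print(name, "has dimension", dim)
```

The last number is the sum of the first two: four dimensions of spacetime for the base event, plus three for the choice of velocity at that event.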

A general observer space O, however, is just a Cartan geometry modelled on SO(4,1)/SO(3).  This is a 7-dimensional manifold, equipped with the structure of a Cartan geometry.  One class of examples are exactly the future unit tangent bundles to 4-dimensional Lorentzian spacetimes.  In these cases, observer space is naturally a contact manifold: that is, it’s an odd-dimensional manifold equipped with a 1-form \alpha, the contact form, which is such that the top-dimensional form \alpha \wedge d \alpha \wedge \dots \wedge d \alpha is nowhere zero.  This is the odd-dimensional analog of a symplectic manifold.  Contact manifolds are, intuitively, configuration spaces of systems which involve “rolling without slipping” – for instance, a sphere rolling on a plane.  In this case, it’s better to think of the local space of observers which “rolls without slipping” on a spacetime manifold M.

Now, Minkowski space has a slicing into space and time – in fact, one for each observer, who defines the time direction, but the time coordinate does not transform in any meaningful way under the symmetries of the theory, and different observers will choose different ones.  In just the same way, the homogeneous model of observer space can naturally be written as a bundle SO(4,1)/SO(3) \rightarrow SO(4,1)/SO(3,1).  But a general observer space O may or may not be a bundle over an ordinary spacetime manifold, O \rightarrow M.  Every Cartan geometry M gives rise to an observer space O as the bundle of future-directed timelike vectors, but not every Cartan geometry O is of this form, in any natural way. Indeed, without a further condition, we can’t even reconstruct observer space as such a bundle in an open neighborhood of a given observer.

This may be intuitively surprising: it gives a perfectly concrete geometric model in which “spacetime” is relative and observer-dependent, and perhaps only locally meaningful, in just the same way as the distinction between “space” and “time” in General Relativity. It may be impossible, that is, to determine objectively whether two observers are located at the same base event or not. This is a kind of “Relativity of Locality” which is geometrically much like the by-now more familiar Relativity of Simultaneity. Each observer will reach certain conclusions as to which observers share the same base event, but different observers may not agree.  The coincident observers according to a given observer are those reached by a good class of geodesics in O moving only in directions that observer sees as boosts.

When one can reconstruct O \rightarrow M, two observers will agree whether or not they are coincident.  The extra condition which makes this possible is an integrability constraint on the action of the Lie algebra of H (in our main example, H = SO(3,1)) on the observer space O.  In this case, the fibres of the bundle are the orbits of this action, and we have the familiar world of Relativity, where simultaneity may be relative, but locality is absolute.

Lifting Gravity to Observer Space

Apart from describing this model of relative spacetime, another motivation for describing observer space is that one can formulate canonical (Hamiltonian) GR locally near each point in such an observer space.  The goal is to make a link between covariant and canonical quantization of gravity.  Covariant quantization treats the geometry of spacetime all at once, by means of a Lagrangian action functional.  This is mathematically appealing, since it respects the symmetry of General Relativity, namely its diffeomorphism-invariance.  On the other hand, it is remote from the canonical (Hamiltonian) approach to quantization of physical systems, in which the concept of time is fundamental. In the canonical approach, one gets a Hilbert space by quantizing the space of states of a system at a given point in time, and the Hamiltonian for the theory describes its evolution.  This is problematic for diffeomorphism-, or even Lorentz-invariance, since coordinate time depends on a choice of observer.  The point of observer space is that we consider all these choices at once.  Describing GR in O is both covariant, and based on (local) choices of time direction.

This is easiest to describe in the case of a bundle O \rightarrow M.  Then we define a “field of observers” to be a section of the bundle: a choice, at each base event in M, of an observer based at that event.  A field of observers may or may not correspond to a particular decomposition of spacetime into space evolving in time, but locally, at each point in O, it always looks like one.  The resulting theory describes the dynamics of space-geometry over time, as seen locally by a given observer.  In this case, a Cartan connection on observer space is described by a Lie(SO(4,1))-valued form.  This decomposes into four Lie-algebra valued forms, interpreted as infinitesimal transformations of the model observer by: (1) spatial rotations; (2) boosts; (3) spatial translations; (4) time translation.  The four-fold division is based on two distinctions: first, between the base event at which the observer lives, and the choice of observer (i.e. the reduction of SO(4,1) to SO(3,1), which symmetry breaking entails choosing a point); and second, between space and time (i.e. the reduction of SO(3,1) to SO(3), which symmetry breaking entails choosing a time direction).
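The four-fold division can be made explicit by sorting the ten generators of so(4,1) by which indices they mix. This sketch assumes a labelling where index 0 is time, indices 1 to 3 are space, and index 4 is the extra direction whose reduction picks out the base event; that labelling and the names below are my assumptions.

```python
from collections import Counter
from itertools import combinations

TIME, EXTRA = 0, 4
SPACE = {1, 2, 3}

def classify(a, b):
    """Type of the generator M_ab of so(4,1), for index pairs with a < b."""
    if a in SPACE and b in SPACE:
        return "spatial rotation"
    if a == TIME and b in SPACE:
        return "boost"
    if a in SPACE and b == EXTRA:
        return "spatial translation"
    if a == TIME and b == EXTRA:
        return "time translation"
    raise ValueError("expected indices 0 <= a < b <= 4")

# One generator for each unordered pair of distinct indices in 0..4.
counts = Counter(classify(a, b) for a, b in combinations(range(5), 2))

for kind, number in counts.items():
    print(kind, number)
```

The rotations are the so(3) that survives both reductions, rotations plus boosts make up so(3,1), and the remaining four generators are the translations that move the base event.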

This splitting, along the same lines as the one in MacDowell-Mansouri gravity described above, suggests that one could lift GR to a theory on an observer space O.  This amounts to describing fields on O and an action functional, so that the splitting of the fields gives back the usual fields of GR on spacetime, and the action gives back the usual action.  This part of the project is still under development, but this lifting has been described.  In the case when there is no “objective” spacetime, the result includes some surprising new fields which it’s not clear how to deal with, but when there is an objective spacetime, the resulting theory looks just like GR.

Well, as promised in the previous post, I’d like to give a summary of some of what was discussed at the conference I attended (quite a while ago now, late last year) in Erlangen, Germany.  I was there also to visit Derek Wise, talking about a project we’ve been working on for some time.

(I’ve also significantly revised this paper about Extended TQFT since then, and it now includes some stuff which was the basis of my talk at Erlangen on cohomological twisting of the category Span(Gpd).  I’ll get to that in the next post.  Also coming up, I’ll be describing some new things I’ve given some talks about recently which relate the Baez-Dolan groupoidification program to Khovanov-Lauda categorification of algebras – at least in one example, hopefully in a way which will generalize nicely.)

In the meantime, there were a few themes at the conference which bear on the Extended TQFT project in various ways, so in this post I’ll describe some of them.  (This isn’t an exhaustive description of all the talks: just of a selection of illustrative ones.)

Categories with Structures

A few talks were mainly about facts regarding the sorts of categories which get used in field theory contexts.  One important type, for instance, is the fusion category: a monoidal category which is enriched in vector spaces, generated by simple objects, and has some other properties: essentially, a monoidal 2-vector space.  The basic example would be categories of representations (of groups, quantum groups, algebras, etc.), but fusion categories are an abstraction of (some of) their properties.  Many of the standard properties are described and proved in this paper by Etingof, Nikshych, and Ostrik, which also poses one of the basic conjectures, the “ENO Conjecture”, which was referred to repeatedly in various talks.  This is the guess that every fusion category can be given a “pivotal” structure: an isomorphism from Id to **.  It generalizes the theorem that there’s always such an isomorphism into ****.  More on this below.

Hendryk Pfeiffer talked about a combinatorial way to classify fusion categories in terms of certain graphs (see this paper here).  One way I understand this idea is to ask how much this sort of category really does generalize categories of representations, or actually comodules.  One starting point for this is the theorem that there’s a pair of functors between certain monoidal categories and weak Hopf algebras.  Specifically, the monoidal categories are (Cat \downarrow Vect)^{\otimes}, which consists of monoidal categories equipped with a forgetful functor into Vect.  Then from this one can get (via a coend) a weak Hopf algebra over the base field k (in the category WHA_k).  From a weak Hopf algebra H, one can get back such a category by taking all the modules of H.  These two processes form an adjunction: they’re not inverses, but we have maps between the two composites and the identity functors.

The new result Hendryk gave is that if we restrict our categories over Vect to be abelian, and the functors between them to be linear, faithful, and exact (that is, roughly, that we’re talking about concrete monoidal 2-vector spaces), then this adjunction is actually an equivalence: so essentially, all such categories C may as well be module categories for weak Hopf algebras.  Then he gave a characterization of these in terms of the “dimension graph” (in fact a quiver) for (C,M), where M is one of the monoidal generators of C.  The vertices of \mathcal{G} = \mathcal{G}_{(C,M)} are labelled by the irreducible representations v_i (i.e. set of generators of the category), and there’s a set of edges j \rightarrow l labelled by a basis of Hom(v_j, v_l \otimes M).  Then one can carry on and build a big graded algebra H[\mathcal{G}] whose m-graded part consists of length-m paths in \mathcal{G}.  Then the point is that the weak Hopf algebra of which C is (up to isomorphism) the module category will be a certain quotient of H[\mathcal{G}] (after imposing some natural relations in a systematic way).
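
To make the dimension graph a little more concrete, here is a minimal sketch in Python. It is not taken from the talk: I am assuming the familiar example C = Rep(S_3) with M the 2-dimensional standard representation, and the names `ADJACENCY` and `count_paths` are made up for illustration. The code only counts length-m paths in the quiver, which is the number of basis elements of the m-graded part of H[\mathcal{G}] before any relations are imposed.

```python
# Simple objects of Rep(S_3): 0 = trivial, 1 = sign, 2 = standard (2-dimensional).
# ADJACENCY[j][l] = dim Hom(v_j, v_l (x) M) with M = standard, using
#   triv (x) std = std,  sgn (x) std = std,  std (x) std = triv + sgn + std.
ADJACENCY = [
    [0, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
]


def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def count_paths(adjacency, length):
    """Number of directed paths with `length` edges, i.e. the sum of the entries of A^length."""
    size = len(adjacency)
    power = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(length):
        power = mat_mul(power, adjacency)
    return sum(sum(row) for row in power)


# Sizes of the path bases in degrees 0..3 (vertices, edges, 2-step paths, ...).
graded_dimensions = [count_paths(ADJACENCY, m) for m in range(4)]
print(graded_dimensions)
```

The weak Hopf algebra itself would be a quotient of this, so these numbers are only upper bounds on its graded pieces.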

The point, then, is that the sort of categories mostly used in this area can be taken to be representation categories, but in general only of these weak Hopf algebras: groups and ordinary algebras are special cases, but they show up naturally for certain kinds of field theory.

Tensor Categories and Field Theories

There were several talks about the relationship between tensor categories of various sorts and particular field theories.  The idea is that local field theories can be broken down in terms of some kind of n-category: n-dimensional regions get labelled by categories, (n-1)-D boundaries between regions, or “defects”, are labelled by functors between the categories (with the idea that this shows how two different kinds of field can couple together at the defect), and so on (I think the highest dimension that was discussed explicitly involved 3-categories, so one has junctions between defects, and junctions between junctions, which get assigned some higher-morphism data).  Alternatively, there’s the dual picture where categories are assigned to points, functors to 1-manifolds, and so on.  (This is just Poincaré duality in the case where the manifolds come with a decomposition into cells, which they often do, if only for convenience.)

Victor Ostrik gave a pair of talks giving an overview of the role tensor categories play in conformal field theory.  There’s too much material here to easily summarize, but the basics go like this: CFTs are field theories defined on cobordisms that have some conformal structure (i.e. notion of angles, but not distance), and on the algebraic side they are associated with vertex algebras (some useful discussion appears on mathoverflow, but in this context they can be understood as vector spaces equipped with exactly the algebraic operations needed to model cobordisms with some local holomorphic structure).

In particular, the irreducible representations of these VOA’s determine the “conformal blocks” of the theory, which tell us about possible correlations between observables (self-adjoint operators).  A VOA V is “rational” if the category Rep(V) is semisimple (i.e. generated as finite direct sums of these conformal blocks).  For good VOA’s, Rep(V) will be a modular tensor category (MTC), which is a fusion category with a duality, braiding, and some other structure (see this for more).  So describing these gives us a lot of information about what CFT’s are possible.

The full data of a rational CFT are given by a vertex algebra, and a module category M: that is, a fusion category is a sort of categorified ring, so it can act on M as a ring acts on a module.  It turns out that choosing an M is equivalent to finding a certain algebra (i.e. algebra object) \mathcal{L}, a “Lagrangian algebra” inside the centre of Rep(V).  The Drinfel’d centre Z(C) of a monoidal category C is a sort of free way to turn a monoidal category into a braided one: but concretely in this case it just looks like Rep(V) \otimes Rep(V)^{\ast}.  Knowing the isomorphism class of \mathcal{L} determines a “modular invariant”.  It gets “physics” meaning from how it’s equipped with an algebra structure (which can happen in more than one way), but in any case \mathcal{L} has an underlying vector space, which becomes the Hilbert space of states for the conformal field theory, which the VOA acts on in the natural way.

Now, that was all conformal field theory.  Christopher Douglas described some work with Chris Schommer-Pries and Noah Snyder about fusion categories and structured topological field theories.  These are functors out of cobordism categories, the most important of which are n-categories, where the objects are points, morphisms are 1D cobordisms, and so on up to n-morphisms which are n-dimensional cobordisms.  To keep things under control, Chris Douglas talked about the case Bord_0^3, which is where n=3, and a “local” field theory is a 3-functor Bord_0^3 \rightarrow \mathcal{C} for some 3-category \mathcal{C}.  Now, the (Baez-Dolan) Cobordism Hypothesis, which was proved by Jacob Lurie, says that Bord_0^3 is, in a suitable sense, the free symmetric monoidal 3-category with duals.  What this amounts to is that a local field theory whose target 3-category is \mathcal{C} is “just” a dualizable object of \mathcal{C}.

The handy example which links this up to the above is when \mathcal{C} has objects which are tensor categories, morphisms which are bimodule categories (i.e. categories acted on from both sides by tensor categories), 2-morphisms which are functors, and 3-morphisms which are natural transformations.  Then the issue is to classify what kind of tensor categories these objects can be.

The story is trickier if we’re talking about, not just topological cobordisms, but ones equipped with some kind of structure regulated by a structure group G (for instance, orientation by G=SO(n), spin structure by its universal cover G=Spin(n), and so on).  This means the cobordisms come equipped with a map into BG.  They take O(n) as the starting point, and then consider groups G with a map to O(n), and require that the map into BG is a lift of the map to BO(n).  Then one gets that a structured local field theory amounts to a dualizable object of \mathcal{C} with a homotopy-fixed point for some G-action – and this describes what gets assigned to the point by such a field theory.  What they then show is a correspondence between G and classes of categories.  For instance, fusion categories are what one gets by imposing that the cobordisms be oriented.

Liang Kong talked about “Topological Orders and Tensor Categories”, which used the Levin-Wen models, from condensed matter physics.  (Benjamin Balsam also gave a nice talk describing these models and showing how they’re equivalent to the Turaev-Viro and Kitaev models in appropriate cases.  Ingo Runkel gave a related talk about topological field theories with “domain walls”.)  Here, the idea of a “defect” (and topological order) can be understood very graphically: we imagine a 2-dimensional crystal lattice (of atoms, say), and the defect is a 1-dimensional place where the two lattices join together, with the internal symmetry of each breaking down at the boundary.  (For example, a square lattice glued where the edges on one side are offset and meet the squares on the other side in the middle of a face, as you typically see in a row of bricks – the slides linked above have some pictures).  The Levin-Wen models are built using a hexagonal lattice, starting with a tensor category with several properties: spherical (there are dualities satisfying some relations), fusion, and unitary: in fact, historically, these defining properties were rediscovered independently here as the requirement for there to be excitations on the boundary which satisfy physically-inspired consistency conditions.

These abstract the properties of a category of representations.  A generalization of this to “topological orders” in 3D or higher is an extended TFT in the sense mentioned just above: they have a target 3-category of tensor categories, bimodule categories, functors and natural transformations.  The tensor categories (say, \mathcal{C}, \mathcal{D}, etc.) get assigned to the bulk regions; to “domain walls” between different regions, namely defects between lattices, we assign bimodule categories (but, for instance, to a line within a region, we get \mathcal{C} understood as a \mathcal{C}-\mathcal{C}-bimodule); then to codimension 2 and 3 defects we attach functors and natural transformations.  The algebra for how these combine expresses the ways these topological defects can go together.  On a lattice, this is an abstraction of a spin network model, where typically we have just one tensor category \mathcal{C} applied to the whole bulk, namely the representations of a Lie group (say, a unitary group).  Then we do calculations by breaking down into bases: on codimension-1 faces, these are simple objects of \mathcal{C}; to vertices we assign a Hom space (and label by a basis for intertwiners in the special case); and so on.

Thomas Nikolaus spoke about the same kind of G-equivariant Dijkgraaf-Witten models as at our workshop in Lisbon, so I’ll refer you back to my earlier post on that.  However, speaking of equivariance and group actions:

Michael Müger  spoke about “Orbifolds of Rational CFT’s and Braided Crossed G-Categories” (see this paper for details).  This starts with that correspondence between rational CFT’s (strictly, rational chiral CFT’s) and modular categories Rep(F).  (He takes F to be the name of the CFT).  Then we consider what happens if some finite group G acts on F (if we understand F as a functor, this is an action by natural transformations; if as an algebra, then ).  This produces an “orbifold theory” F^G (just like a finite group action on a manifold produces an orbifold), which is the “G-fixed subtheory” of F, by taking G-fixed points for every object, and is also a rational CFT.  But that means it corresponds to some other modular category Rep(F^G), so one would like to know what category this is.

A natural guess might be that it’s Rep(F)^G, where C^G is a “weak fixed-point” category that comes from a weak group action on a category C.  Objects of C^G are pairs (c,f_g) where c \in C and f_g : g(c) \rightarrow c is a specified isomorphism.  (This is a weak analog of S^G, the set of fixed points for a group acting on a set).  But this guess is wrong – indeed, it turns out these categories have the wrong dimension (which is defined because the modular category has a trace, which we can sum over generating objects).  Instead, the right answer, denoted by Rep(F^G) = G-Rep(F)^G, is the G-fixed part of some other category.  It’s a braided crossed G-category: one with a grading by G, and a G-action that gets along with it.  The identity-graded part of Rep(F^G) is just the original Rep(F).

State Sum Models

This ties in somewhat with at least some of the models in the previous section.  Some of these were somewhat introductory, since many of the people at the conference were coming from a different background.  So, for instance, to begin the workshop, John Barrett gave a talk about categories and quantum gravity, which started by outlining the historical background, and the development of state-sum models.  He gave a second talk where he began to relate this to diagrams in Gray-categories (something he also talked about here in Lisbon in February, which I wrote about then).  He finished up with some discussion of spherical categories (and in particular the fact that there is a Gray-category of spherical categories, with a bunch of duals in the suitable sense).  This relates back to the kind of structures Chris Douglas spoke about (described above, but chronologically right after John).  Likewise, Winston Fairbairn gave a talk about state sum models in 3D quantum gravity – the Ponzano Regge model and Turaev-Viro model being the focal point, describing how these work and how they’re constructed.  Part of the point is that one would like to see that these fit into the sort of framework described in the section above, which for PR and TV models makes sense, but for the fancier state-sum models in higher dimensions, this becomes more complicated.

Higher Gauge Theory

There wasn’t as much on this topic as at our own workshop in Lisbon (though I have more remarks on higher gauge theory in one post about it), but there were a few entries.  Roger Picken talked about some work with Joao Martins about a cubical formalism for parallel transport based on crossed modules, which consist of a group G and abelian group H, with a map \partial : H \rightarrow G and an action of G on H satisfying some axioms.  They can represent categorical groups, namely group objects in Cat (equivalently, categories internal to Grp), and are “higher” analogs of groups with a set of elements.  Roger’s talk was about how to understand holonomies and parallel transports in this context.  So, a “connection” lets one transport things with G-symmetries along paths, and with H-symmetries along surfaces.  It’s natural to describe this with squares whose edges are labelled by G-elements, and faces labelled by H-elements (which are the holonomies).  Then the “cubical approach” means that we can describe gauge transformations, and higher gauge transformations (which in one sense are the point of higher gauge theory) in just the same way: a gauge transformation which assigns H-values to edges and G-values to vertices can be drawn via the holonomies of a connection on a cube which extends the original square into 3D (so the edges become squares, and so get H-values, and so on).  The higher gauge transformations work in a similar way.  This cubical picture gives a good way to understand the algebra of how gauge transformations etc. work: so for instance, gauge transformations look like “conjugation” of a square by four other squares – namely, relating the front and back faces of a cube by means of the remaining faces.  Higher gauge transformations can be described by means of a 4D hypercube in an analogous way, and their algebraic properties have to do with the 2D faces of the hypercube.
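
For readers who haven’t met crossed modules, here is a small sanity check in Python of the axioms alluded to above, in what I take to be their standard form: equivariance, \partial(g \rhd h) = g \partial(h) g^{-1}, and the Peiffer identity, \partial(h) \rhd h' = h h' h^{-1}. The example is my own choice, not one from the talk: G = S_3, H = \mathbb{Z}/3 (written additively), \partial sends k to the k-th power of a 3-cycle, and G acts on H through the sign of a permutation. All function names are hypothetical.

```python
from itertools import permutations

IDENTITY = (0, 1, 2)
CYCLE = (1, 2, 0)  # a fixed 3-cycle generating the image of the boundary map


def compose(p, q):
    """Composition of permutations stored as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def sign(p):
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


G = list(permutations(range(3)))  # the symmetric group S_3
H = range(3)                      # Z/3, written additively


def boundary(k):
    """The map H -> G sending k to CYCLE^k."""
    result = IDENTITY
    for _ in range(k % 3):
        result = compose(CYCLE, result)
    return result


def act(g, k):
    """Action of G on H: even permutations fix k, odd ones negate it."""
    return (sign(g) * k) % 3


def is_equivariant():
    return all(
        boundary(act(g, k)) == compose(compose(g, boundary(k)), inverse(g))
        for g in G for k in H
    )


def satisfies_peiffer():
    # H is abelian, so conjugation h h' h^{-1} is just h' again.
    return all(act(boundary(k), k2) == (k + k2 - k) % 3 for k in H for k2 in H)


print(is_equivariant(), satisfies_peiffer())
```

In the cubical picture, G-elements of this kind would label edges and H-elements would label faces.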

Derek Wise gave a short talk outlining his recent paper with John Baez in which they show that it’s possible to construct a higher gauge theory based on the Poincare 2-group which turns out to have fields, and dynamics, which are equivalent to teleparallel gravity, a slightly unusual theory which nevertheless looks in practice just like General Relativity.  I discussed this in a previous post.

So next time I’ll talk about the new additions to my paper on ETQFT which were the basis of my talk, which illustrates a few of the themes above.

So I’ve been travelling a lot in the last month, spending more than half of it outside Portugal. I was in Ottawa, Canada for a Fields Institute workshop, “Categorical Methods in Representation Theory“. Then a little later I was in Erlangen, Germany for one called “Categorical and Representation-Theoretic Methods in Quantum Geometry and CFT“. Despite the similar-sounding titles, these were on fairly different themes, though Marco Mackaay was at both, talking about categorifying the q-Schur algebra by diagrams.  I’ll describe the meetings, but for now I’ll start with the first.  Next post will be a summary of the second.

The Ottawa meeting was organized by Alistair Savage and Alex Hoffnung (like me, a former student of John Baez). Alistair gave a talk here at IST over the summer about a q-deformation of Khovanov’s categorification of the Heisenberg Algebra I discussed in an earlier entry. A lot of the discussion at the workshop was based on the Khovanov-Lauda program, which began with categorifying quantum versions of the classical Lie groups, and is now making lots of progress in the categorification of algebras, representation theory, and so on.

The point of this program is to describe “categorifications” of particular algebras. This means finding monoidal categories with the property that when you take the Grothendieck ring (the ring of isomorphism classes, with a multiplication given by the monoidal structure), you get back the integral form of some algebra. (And then recover the original by taking the tensor over \mathbb{Z} with \mathbb{C}). The key thing is how to represent the algebra by generators and relations. Since free monoidal categories with various sorts of structures can be presented as categories of string diagrams, it shouldn’t be surprising that the categories used tend to have objects that are sequences (i.e. monoidal products) of dots with various sorts of labelling data, and morphisms which are string diagrams that carry those labels on strands (actually, usually they’re linear combinations of such diagrams, so everything is enriched in vector spaces). Then one imposes relations on the “free” data given this way, by saying that the diagrams are considered the same morphism if they agree up to some local moves. The whole problem then is to find the right generators (labelling data) and relations (local moves). The result will be a categorification of a given presentation of the algebra you want.

So for instance, I was interested in Sabin Cautis and Anthony Licata‘s talks connected with this paper, “Heisenberg Categorification And Hilbert Schemes”. This is connected with a generalization of Khovanov’s categorification linked above, to include a variety of other algebras which are given a similar name. The point is that there’s such a “Heisenberg algebra” associated to different subgroups \Gamma \subset SL(2,\mathbf{k}), which in turn are classified by Dynkin diagrams. The vertices of these Dynkin diagrams correspond to some generators of the Heisenberg algebra, and one can modify Khovanov’s categorification by having strands in the diagram calculus be labelled by these vertices. Rules for local moves involving strands with different labels will be governed by the edges of the Dynkin diagram. Their paper goes on to describe how to represent these categorifications on certain categories of Hilbert schemes.

Along the same lines, Aaron Lauda gave a talk on the categorification of the NilHecke algebra. This is defined as a subalgebra of endomorphisms of P_a = \mathbb{Z}[x_1,\dots,x_a], generated by multiplications (by the x_i) and the divided difference operators \partial_i. These are different from the usual derivative operators: in place of the differences between values of a single variable, they measure how a function behaves under the operation s_i which switches variables x_i and x_{i+1} (that is, the reflection in the hyperplane where x_i = x_{i+1}). The point is that just like differentiation, this operator – together with multiplication – generates an algebra in End(\mathbb{Z}[x_1,\dots,x_a]). Aaron described how to categorify this presentation of the NilHecke algebra with a string-diagram calculus.
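
To see these operators in action, here is a minimal Python sketch. It assumes the usual formula \partial_i f = (f - s_i f)/(x_i - x_{i+1}), and the representation of polynomials as dictionaries and all function names are my own choices. It checks two familiar relations on one sample polynomial: \partial_i^2 = 0 and \partial_i x_i - x_{i+1} \partial_i = 1.

```python
from collections import defaultdict

# A polynomial in x_1..x_a is a dict {exponent tuple: integer coefficient}.
# Index i below is 0-based, so i = 0 refers to the pair (x_1, x_2).


def divided_difference(poly, i):
    """Apply f -> (f - s_i f) / (x_i - x_{i+1}), one monomial at a time."""
    out = defaultdict(int)
    for exps, coeff in poly.items():
        a, b = exps[i], exps[i + 1]
        lo, hi = min(a, b), max(a, b)
        sign = 1 if a > b else -1
        # (x^a y^b - x^b y^a)/(x - y) = sign * (xy)^lo * sum_k x^k y^(hi-lo-1-k);
        # when a == b the range is empty and the monomial is sent to zero.
        for k in range(hi - lo):
            new = list(exps)
            new[i] = lo + k
            new[i + 1] = lo + (hi - lo - 1 - k)
            out[tuple(new)] += sign * coeff
    return {e: c for e, c in out.items() if c != 0}


def multiply_by_variable(poly, i):
    """Multiply by the variable in position i (0-based)."""
    out = {}
    for exps, coeff in poly.items():
        new = list(exps)
        new[i] += 1
        out[tuple(new)] = coeff
    return out


def subtract(p, q):
    out = defaultdict(int)
    for exps, coeff in p.items():
        out[exps] += coeff
    for exps, coeff in q.items():
        out[exps] -= coeff
    return {e: c for e, c in out.items() if c != 0}


# Sample polynomial f = x_1^2 x_2 + 3 x_2^3 in two variables.
f = {(2, 1): 1, (0, 3): 3}

d_f = divided_difference(f, 0)
nilpotent = divided_difference(d_f, 0) == {}
leibniz_like = subtract(
    divided_difference(multiply_by_variable(f, 0), 0),
    multiply_by_variable(d_f, 1),
) == f
print(d_f, nilpotent, leibniz_like)
```

The first relation is where the “Nil” in the name comes from; the second plays the role that [d/dx, x] = 1 plays for ordinary differentiation.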

So anyway, there were a number of talks about the explosion of work within this general program – for instance, Marco Mackaay’s which I mentioned, as well as that of Pedro Vaz about the same project. One aspect of this program is that the relatively free “string diagram categories” are sometimes replaced with categories where the objects are bimodules and morphisms are bimodule homomorphisms. Making the relationship precise is then a matter of proving these satisfy exactly the relations on a “free” category which one wants, but sometimes they’re a good setting to prove one has a nice categorification. Thus, Ben Elias and Geordie Williamson gave two parts of one talk about “Soergel Bimodules and Kazhdan-Lusztig Theory” (see a blog post by Ben Webster which gives a brief intro to this notion, including pointing out that Soergel bimodules give a categorification of the Hecke algebra).

One of the reasons for doing this sort of thing is that one gets invariants for manifolds from algebras – in particular, things like the Jones polynomial, which is related to the Temperley-Lieb algebra. A categorification of it is Khovanov homology (which gives, for a manifold, a complex, with the property that the graded Euler characteristic of the complex is the Jones polynomial). The point here is that categorifying the algebra lets you raise the dimension of the kind of manifold your invariants are defined on.

So, for instance, Scott Morrison described “Invariants of 4-Manifolds from Khovanov Homology“.  This was based on a generalization of the relationship between TQFT’s and planar algebras.  The point is, planar algebras are described by the composition of diagrams of the following form: a big circle, containing some number of small circles.  The boundaries of each circle are labelled by some number of marked points, and the space between carries curves which connect these marked points in some way.  One composes these diagrams by gluing big circles into smaller circles (there’s some further discussion here including a picture, and much more in this book here).  Scott Morrison described these diagrams as “spaghetti and meatball” diagrams.  Planar algebras show up by associating a vector space to “the” circle with n marked points, and linear maps to each way (up to isotopy) of filling in edges between such circles.  One can think of the circles and marked-disks as a marked-cobordism category, and so a functorial way of making these assignments is something like a TQFT.  It also gives lots of vector spaces and lots of linear maps that fit together in a particular way described by this category of marked cobordisms, which is what a “planar algebra” actually consists of.  Clearly, these planar algebras can be used to get some manifold invariants – namely the “TQFT” that corresponds to them.

Scott Morrison’s talk described how to get invariants of 4-dimensional manifolds in a similar way by boosting (almost) everything in this story by 2 dimensions.  You start with a 4-ball, whose boundary is a 3-sphere, and excise some number of 4-balls (with 3-sphere boundaries) from the interior.  Then let these 3D boundaries be “marked” with 1-D embedded links (think “knots” if you like).  These 3-spheres with embedded links are the objects in a category.  The morphisms are 4-balls which connect them, containing 2D knotted surfaces which happen to intersect the boundaries exactly at their embedded links.  By analogy with the image of “spaghetti and meatballs”, where the spaghetti is a collection of 1D marked curves, Morrison calls these 4-manifolds with embedded 2D surfaces “lasagna diagrams” (which generalizes to the less evocative case of “(n,k) pasta diagrams”, where we’ve just mentioned the (2,1) and (4,2) cases, with k-dimensional “pasta” embedded in n-dimensional balls).  Then the point is that one can compose these pasta diagrams by gluing the 4-balls along these marked boundaries.  One then gets manifold invariants from these sorts of diagrams by using Khovanov homology, which assigns to

Ben Webster talked about categorification of Lie algebra representations, in a talk called “Categorification, Lie Algebras and Topology“. This is also part of categorifying manifold invariants, since the Reshetikhin-Turaev Invariants are based on some monoidal category, which in this case is the category of representations of some algebra.  Categorifying this to a 2-category gives higher-dimensional equivalents of the RT invariants.  The idea (which you can check out in those slides) is that this comes down to describing the analog of the “highest-weight” representations for some Lie algebra you’ve already categorified.

The Lie theory point here, you might remember, is that representations of Lie algebras \mathfrak{g} can be analyzed by decomposing them into “weight spaces” V_{\lambda}, associated to weights \lambda : \mathfrak{h} \rightarrow \mathbf{k} (where \mathfrak{h} \subset \mathfrak{g} is a Cartan subalgebra and \mathbf{k} is the base field, which we can generally assume is \mathbb{C}).  Weights turn elements of \mathfrak{h} into scalars, then.  So weight spaces generalize eigenspaces, in that acting by any element h \in \mathfrak{h} on a “weight vector” v \in V_{\lambda} amounts to multiplying by \lambda(h).  (So that v is an eigenvector for each h, but the eigenvalue depends on h, and is given by the weight.)  A weight can be the “highest” with respect to a natural order that can be put on weights (\lambda \geq \mu if the difference is a nonnegative combination of simple roots).  Then a “highest weight representation” is one which is generated under the action of \mathfrak{g} by a single weight vector v, the “highest weight vector”.
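
The smallest case makes this concrete. Below is a sketch in Python of the (n+1)-dimensional highest weight representation of \mathfrak{sl}(2), using the standard basis v_k = f^k v_0 with h v_k = (n-2k) v_k, f v_k = v_{k+1} and e v_k = k(n-k+1) v_{k-1}. The choice of example and the function names are mine. The weights are read off as the eigenvalues of the diagonal element h, and the code checks that the three matrices really satisfy the \mathfrak{sl}(2) brackets.

```python
def sl2_matrices(n):
    """Matrices of e, f, h on the basis v_0..v_n; column k is the image of v_k."""
    size = n + 1
    e = [[0] * size for _ in range(size)]
    f = [[0] * size for _ in range(size)]
    h = [[0] * size for _ in range(size)]
    for k in range(size):
        h[k][k] = n - 2 * k                 # v_k is a weight vector of weight n - 2k
        if k + 1 < size:
            f[k + 1][k] = 1                 # f lowers the weight: v_k -> v_{k+1}
        if k >= 1:
            e[k - 1][k] = k * (n - k + 1)   # e raises the weight: v_k -> v_{k-1}
    return e, f, h


def mat_mul(a, b):
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def commutator(a, b):
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


e, f, h = sl2_matrices(3)
weights = [h[k][k] for k in range(4)]
relations_hold = (
    commutator(e, f) == h
    and commutator(h, e) == scale(2, e)
    and commutator(h, f) == scale(-2, f)
)
highest_weight_vector_killed = all(row[0] == 0 for row in e)  # e v_0 = 0
print(weights, relations_hold, highest_weight_vector_killed)
```

Here v_0 is the highest weight vector: e kills it, and repeatedly applying f generates the whole representation.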

The point of the categorification is to describe the representation in the same terms.  First, we introduce a special strand (which Ben Webster draws as a red strand) which represents the highest weight vector.  Then we say that the category that stands in for the highest weight representation is just what we get by starting with this red strand, and putting all the various string diagrams of the categorification of \mathfrak{g} next to it.  One can then go on to talk about tensor products of these representations, where objects are found by amalgamating several such diagrams (with several red strands) together.  And so on.  These categorified representations are then supposed to be usable to give higher-dimensional manifold invariants.

Now, the flip side of higher categories that reproduce ordinary representation theory would be the representation theory of higher categories in their natural habitat, so to speak. Presumably there should be a fairly uniform picture where categorifications of normal representation theory will be special cases of this. Volodymyr Mazorchuk gave an interesting talk called 2-representations of finitary 2-categories.  He gave an example of one of the 2-categories that shows up a lot in these Khovanov-Lauda categorifications, the 2-category of Soergel Bimodules mentioned above.  This has one object, which we can think of as a category of modules over the algebra \mathbb{C}[x_1, \dots, x_n]/I (where I is some ideal of homogeneous symmetric polynomials).  The morphisms are endofunctors of this category, which all amount to tensoring with certain bimodules – the irreducible ones being the Soergel bimodules.  The point of the talk was to explain the representations of 2-categories \mathcal{C} – that is, 2-functors from \mathcal{C} into some “classical” 2-category.  Examples would be 2-categories like “2-vector spaces”, or variants on it.  The examples he gave: (1) [small fully additive \mathbf{k}-linear categories], (2) the full subcategory of it with finitely many indecomposable elements, (3) [categories equivalent to module categories of finite dimensional associative \mathbf{k}-algebras].  All of these have some claim to be a 2-categorical analog of [vector spaces].  In general, Mazorchuk allowed representations of “FIAT” categories: Finitary (Two-)categories with Involutions and Adjunctions.

Part of the process involved getting a “multisemigroup” from such categories: a set S with an operation which takes pairs of elements, and returns a subset of S, satisfying some natural associativity condition.  (Semigroups are the case where the subset contains just one element – groups are the case where furthermore the operation is invertible).  The idea is that FIAT categories have some set of generators – indecomposable 1-morphisms – and that the multisemigroup describes which indecomposables show up in a composite.  (If we think of the 2-category as a monoidal category, this is like talking about a decomposition of a tensor product of objects).  So, for instance, for the 2-category that comes from the monoidal category of \mathfrak{sl}(2) modules, we get the semigroup of nonnegative integers.  For the Soergel bimodule 2-category, we get the symmetric group.  This sort of thing helps characterize when two objects are equivalent, and in turn helps describe 2-representations up to some equivalence.  (You can find much more detail behind the link above.)
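
Here is a toy version of the \mathfrak{sl}(2) example in Python, as I understand it (an illustration under my own assumptions, not something from the talk). Label the irreducible module of highest weight n by the nonnegative integer n, and let the multivalued product m * n be the set of labels appearing in the Clebsch-Gordan decomposition of the tensor product, namely |m-n|, |m-n|+2, ..., m+n. The function names are made up. The associativity condition then says the two ways of bracketing a triple product give the same subset.

```python
def tensor_labels(m, n):
    """Labels of the irreducible sl(2)-modules occurring in V_m (x) V_n (Clebsch-Gordan)."""
    return set(range(abs(m - n), m + n + 1, 2))


def left_bracketing(a, b, c):
    """All labels occurring in (a * b) * c."""
    return set().union(*(tensor_labels(s, c) for s in tensor_labels(a, b)))


def right_bracketing(a, b, c):
    """All labels occurring in a * (b * c)."""
    return set().union(*(tensor_labels(a, t) for t in tensor_labels(b, c)))


def is_associative(bound):
    """Check the multisemigroup associativity condition for all labels below `bound`."""
    labels = range(bound)
    return all(
        left_bracketing(a, b, c) == right_bracketing(a, b, c)
        for a in labels for b in labels for c in labels
    )


print(tensor_labels(2, 3), is_associative(5))
```

An honest semigroup is the special case where every `tensor_labels(m, n)` has exactly one element, which is not what happens in this toy model, so the multivalued structure is really needed here.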

On the more classical representation-theoretic side of things, Joel Kamnitzer gave a talk called “Spiders and Buildings”, which was concerned with some geometric and combinatorial constructions in representation theory.  These involved certain trivalent planar graphs, called “webs”, whose edges carry labels between 1 and (n-1).  They’re embedded in a disk, and the outgoing edges, with labels (k_1, \dots, k_m) determine a representation space for a group G, say G = SL_n, namely the tensor product of a bunch of wedge products, \otimes_j \wedge^{k_j} \mathbb{C}^n, where SL_n acts on \mathbb{C}^n as usual.  Then a web determines an invariant vector in this space.  This comes about by having invariant vectors for each vertex (the basic case where m =3), and tensoring them together.  But the point is to interpret this construction geometrically.  This was a bit outside my grasp, since it involves the Langlands program and the geometric Satake correspondence, neither of which I know much of anything about, but which give geometric/topological ways of constructing representation categories.  One thing I did pick up is that it uses the “Langlands dual group” \check{G} of G to get a certain metric space called Gr_{\check{G}}.  Then there’s a correspondence between the category of representations of G and the category of (perverse, constructible) sheaves on this space.  This correspondence can be used to describe the vectors that come out of these webs.

Jim Dolan gave a couple of talks while I was there, which actually fit together as two parts of a bigger picture – one was during the workshop itself, and one at the logic seminar on the following Monday. It helped a lot to see both in order to appreciate the overall point, so I’ll mix them a bit indiscriminately. The first was called “Dimensional Analysis is Algebraic Geometry”, and the second “Toposes of Quasicoherent Sheaves on Toric Varieties”. For the purposes of the logic seminar, he gave the slogan of the second talk as “Algebraic Geometry is a branch of Categorical Logic”. Jim’s basic idea was inspired by Bill Lawvere’s concept of a “theory”, which is supposed to extend both “algebraic theories” (such as the “theory of groups”) and theories in the sense of physics.  Any given theory is some structured category, and “models” of the theory are functors into some other category to represent it – it thus has a functor category called its “moduli stack of models”.  A physical theory (essentially, models which depict some contents of the universe) has some parameters.  The “theory of elastic scattering”, for instance, has the masses, and initial and final momenta, of two objects which collide and “scatter” off each other.  The moduli space for this theory amounts to assignments of values to these parameters, which must satisfy some algebraic equations – conservation of energy and momentum (for example, \sum_i m_i v_i^{in} = \sum_i m_i v_i^{out}, where i \in \{1, 2\}).  So the moduli space is some projective algebraic variety.  Jim explained how “dimensional analysis” in physics is the study of line bundles over such varieties (“dimensions” are just such line bundles, since a “dimension” is a 1-dimensional sort of thing, and “quantities” in those dimensions are sections of the line bundles).  Then there’s a category of such bundles, which are organized into a special sort of symmetric monoidal category – in fact, it’s constrained so much it’s just a graded commutative algebra.
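To get a feel for the “graded commutative algebra” picture, here is a toy model of my own (not something from Jim’s talk): a dimension is recorded by integer exponents of mass, length and time, multiplying quantities adds the exponents (standing in for tensoring line bundles), and addition is only allowed within a single graded piece.  The `Quantity` class and the choice of three base dimensions are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Quantity:
    """A value together with its grading: exponents of (mass, length, time)."""
    value: float
    dim: Tuple[int, int, int]

    def __mul__(self, other: "Quantity") -> "Quantity":
        # Multiplication adds the gradings (the toy analogue of tensoring "dimensions").
        new_dim = tuple(a + b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value * other.value, new_dim)

    def __add__(self, other: "Quantity") -> "Quantity":
        # Addition only makes sense inside one graded piece.
        if self.dim != other.dim:
            raise ValueError("cannot add quantities of different dimension")
        return Quantity(self.value + other.value, self.dim)


MASS = (1, 0, 0)
VELOCITY = (0, 1, -1)


def momentum(m: float, v: float) -> Quantity:
    """Momentum m*v, which lives in degree (1, 1, -1)."""
    return Quantity(m, MASS) * Quantity(v, VELOCITY)


# Total incoming momentum of a two-body collision: both terms have the same degree.
total_in = momentum(2.0, 3.0) + momentum(1.0, -4.0)
```

Trying to add a mass to a velocity raises an error, which is the toy version of saying that the two are sections of different line bundles.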

In his second talk, he generalized this to talk about categories of sheaves on some varieties – and, since he was talking in the categorical logic seminar, he proposed a point of view for looking at algebraic geometry in the context of logic.  This view could be summarized as: Every (generalized) space studied by algebraic geometry “is” the moduli space of models for some theory in some doctrine.  The term “doctrine” is Bill Lawvere’s, and specifies what kind of structured category the theory and the target of its models are supposed to be (and of course what kind of functors are allowed as models).  Thus, for instance, toposes (as generalized spaces) are supposed to be thought of as “geometric theories”.  He explained that his “dimensional analysis doctrine” is a special case of this.  As usual when talking to Jim, I came away with the sense that there’s a very large program of ideas lurking behind everything he said, of which only the tip of the iceberg actually made it into the talks.

Next post, when I have time, will talk about the meeting at Erlangen…

So Dan Christensen, who used to be my supervisor while I was a postdoc at the University of Western Ontario, came to Lisbon last week and gave a talk about a topic I remember hearing about while I was there.  This is the category Diff of diffeological spaces as a setting for homotopy theory.  Just to make things scan more nicely, I’m going to say “smooth space” for “diffeological space” here, although this term is in fact ambiguous (see Andrew Stacey’s “Comparative Smootheology” for lots of details about options).  There’s a lot of information about Diff in Patrick Iglesias-Zemmour’s draft-of-a-book.


The point of the category Diff, initially, is that it extends the category of manifolds while having some nicer properties.  Thus, while all manifolds are smooth spaces, there are others, which allow Diff to be closed under various operations.  These would include taking limits and colimits: for instance, any subset of a smooth space becomes a smooth space, and any quotient of a smooth space by an equivalence relation is a smooth space.  Then too, Diff has exponentials (that is, if A and B are smooth spaces, so is A^B = Hom(B,A)).

So, for instance, this is a good context for constructing loop spaces: a manifold M is a smooth space, and so is its loop space LM = M^{S^1} = Hom(S^1,M), the space of all maps of the circle into M.  This becomes important for talking about things like higher cohomology, gerbes, etc.  When starting with the category of manifolds, doing this requires you to go off and define infinite dimensional manifolds before LM can even be defined.  Likewise, the irrational torus is hard to talk about as a manifold: you take a torus, thought of as \mathbb{R}^2 / \mathbb{Z}^2.  Then take a direction in \mathbb{R}^2 with irrational slope, and identify any two points which are translates of each other in \mathbb{R}^2 along the direction of this line.  The orbit of any point is then dense in the torus, so this is a very nasty space, certainly not a manifold.  But it’s a perfectly good smooth space.
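The density claim is easy to check numerically.  A line of slope \alpha through the origin meets the circle x = 0 of the torus at the heights n\alpha mod 1, so the sketch below (my own illustration, with \alpha = \sqrt{2} chosen arbitrarily as the irrational slope) measures the largest gap left on that circle by the first few of these points, and compares with a rational slope.

```python
import math


def largest_gap(alpha: float, n_points: int) -> float:
    """Largest gap on the circle R/Z between the points k*alpha mod 1, for k < n_points."""
    pts = sorted((k * alpha) % 1.0 for k in range(n_points))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # gap that wraps around past 1
    return max(gaps)


alpha = math.sqrt(2)                    # irrational slope (assumed for illustration)
coarse = largest_gap(alpha, 10)         # few points: large holes remain
fine = largest_gap(alpha, 1000)         # many points: the holes shrink
rational = largest_gap(0.5, 1000)       # rational slope: the orbit closes up
```

For the irrational slope the largest gap keeps shrinking as more points are added, while for slope 1/2 the orbit only ever visits two points of the circle.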

Well, these examples motivate the kinds of things these nice categorical properties allow us to do, but Diff wouldn’t deserve to be called a category of “smooth spaces” (Souriau’s original name for them) if they didn’t allow a notion of smooth maps, which is the basis for most of what we do with manifolds: smooth paths, derivatives of curves, vector fields, differential forms, smooth cohomology, smooth bundles, and the rest of the apparatus of differential geometry.  As with manifolds, this notion of smooth map ought to get along with the usual notion for \mathbb{R}^n in some sense.

Smooth Spaces

Thus, a smooth (i.e. diffeological) space consists of:

  • A set X (of “points”)
  • A set \{ f : U \rightarrow X \} (of “plots”) for every n and open U \subset \mathbb{R}^n such that:
  1. All constant maps are plots
  2. If f: U \rightarrow X is a plot, and g : V \rightarrow U is a smooth map, f \circ g : V \rightarrow X is a plot
  3. If \{ g_i : U_i \rightarrow U\} is an open cover of U, and f : U \rightarrow X is a map, whose restrictions f \circ g_i : U_i \rightarrow X are all plots, so is f

A smooth map between smooth spaces is one that gets along with all this structure (i.e. the composite with every plot is also a plot).

These conditions mean that smooth maps agree with the usual notion in \mathbb{R}^n, and we can glue together smooth spaces to produce new ones.  A manifold becomes a smooth space by taking all the usual smooth maps to be plots: it’s a full subcategory (we introduce new objects which aren’t manifolds, but no new morphisms between manifolds).  A choice of a set of plots for some space X is a “diffeology”: there can, of course, be many different diffeologies on a given space.

So, in particular, diffeologies can encode a little more than the charts of a manifold.  Just for one example, a diffeology can have “stop signs”, as Dan put it – points with the property that any smooth map from I= [0,1] which passes through them must stop at that point (have derivative zero – or higher derivatives, if you like).  Along the same lines, there’s a nonstandard diffeology on I itself with the property that any smooth map from this I into a manifold M must have all derivatives zero at the endpoints.  This is a better object for defining smooth fundamental groups: you can concatenate these paths at will and they’re guaranteed to be smooth.

As a Quasitopos

An important fact about these smooth spaces is that they are concrete sheaves (i.e. sheaves with underlying sets) on the concrete site (i.e. a Grothendieck site where objects have underlying sets) whose objects are the U \subset \mathbb{R}^n.  This implies many nice things about the category Diff.  One is that it’s a quasitopos.  This is almost the same as a topos (in particular, it has limits, colimits, etc. as described above), but where a topos has a “subobject classifier”, a quasitopos has a weak subobject classifier (which, perhaps confusingly, is “weak” because it only classifies the strong subobjects).

So remember that a subobject classifier is an object with a map t : 1 \rightarrow \Omega from the terminal object, so that any monomorphism (subobject) A \rightarrow X is the pullback of t along some map X \rightarrow \Omega (the classifying map).  In the topos of sets, this is just the inclusion of a one-element set \{\star\} into a two-element set \{T,F\}: the classifying map for a subset A \subset X sends everything in A (i.e. in the image of the inclusion map) to T = Im(t), and everything else to F.  (That is, it’s the characteristic function.)  So pulling back T along the classifying map gives back exactly the subset A.
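In the topos of sets this is concrete enough to write down directly.  A minimal sketch, with finite sets and Python’s `True`/`False` standing in for \{T,F\} (the function names are mine):

```python
def classifying_map(X, A):
    """Characteristic function chi_A : X -> {True, False} of a subset A of X."""
    return {x: (x in A) for x in X}


def pullback_of_true(chi):
    """Pull back t : 1 -> Omega along chi, i.e. take the preimage of True."""
    return {x for x, value in chi.items() if value}


X = {1, 2, 3, 4, 5}
A = {2, 4}
chi = classifying_map(X, A)
recovered = pullback_of_true(chi)   # the subobject classified by chi
```

Here `recovered` is the subset A again, which is the sense in which the classifying map and the subobject determine each other.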

Any topos has one of these – in particular the topos of sheaves on the diffeological site has one.  But Diff consists of the concrete sheaves, not all sheaves.  The subobject classifier of the topos won’t be concrete – but it does have a “concretification”, which turns out to be the weak subobject classifier.  The subobjects of a smooth space X which it classifies (i.e. for which there’s a classifying map as above) are exactly the subsets A \subset X equipped with the subspace diffeology.  (Which is defined in the obvious way: the plots are the plots of X which land in A).

We’ll come back to this quasitopos shortly.  The main point is that Dan and his graduate student, Enxin Wu, have been trying to define a different kind of structure on Diff.  We know it’s good for doing differential geometry.  The hope is that it’s also good for doing homotopy theory.

As a Model Category

The basic idea here is pretty well supported: naively, one can do a lot of the things done in homotopy theory in Diff: to start with, one can define the “smooth homotopy groups” \pi_n^s(X;x_0) of a pointed space.  It’s a theorem by Dan and Enxin that several possible ways of doing this are equivalent.  But, for example, Iglesias-Zemmour defines them inductively, so that \pi_0^s(X) is the set of path-components of X, and \pi_k^s(X) = \pi_{k-1}^s(LX) is defined recursively using loop spaces, mentioned above.  The point is that this all works in Diff much as for topological spaces.

In particular, there are analogs for the \pi_k^s for standard theorems like the long exact sequence of homotopy groups for a bundle.  Of course, you have to define “bundle” in Diff – it’s a smooth surjective map X \rightarrow Y, but saying a diffeological bundle is “locally trivial” doesn’t mean “over open neighborhoods”, but “under pullback along any plot”.  (Either of these converts a bundle over a whole space into a bundle over part of \mathbb{R}^n, where things are easy to define).

Less naively, the kind of category where homotopy theory works is a model category (see also here).  So the project Dan and Enxin have been working on is to give Diff this sort of structure.  While there are technicalities behind those links, the essential point is that this means you have a closed category (i.e. with all limits and colimits, which Diff does), on which you’ve defined three classes of morphisms: fibrations, cofibrations, and weak equivalences.  These are supposed to abstract the properties of maps in the homotopy theory of topological spaces – in that case weak equivalences being maps that induce isomorphisms of homotopy groups, the other two being defined by having some lifting properties (i.e. you can lift a homotopy, such as a path, along a fibration).

So to abstract the situation in Top, these classes have to satisfy some axioms (including an abstract form of the lifting properties).  There are slightly different formulations, but for instance, the “2 of 3” axiom says that if two of f, g and f \circ g are weak equivalences, so is the third.  Or, again, there should be a factorization for any morphism into a fibration and an acyclic cofibration (i.e. one which is also a weak equivalence), and also vice versa (that is, moving the adjective “acyclic” to the fibration).  Defining some classes of maps isn’t hard, but it tends to be that proving they satisfy all the axioms IS hard.

Supposing you could do it, though, you have things like the homotopy category (where you formally allow all weak equivalences to have inverses), derived functors (which come from a situation where homotopy theory is “modelled” by categories of chain complexes), and various other fairly powerful tools.  Doing this in Diff would make it possible to use these things in a setting that supports differential geometry.  In particular, you’d have a lot of high-powered machinery that you could apply to prove things about manifolds, even though it doesn’t work in the category Man itself – only in the larger setting Diff.

Dan and Enxin are still working on nailing down some of the proofs, but it appears to be working.  Their strategy is based on the principle that, for purposes of homotopy, topological spaces act like simplicial complexes.  So they define an affine “simplex”, \mathbb{A}^n = \{ (x_0, x_1, \dots, x_n) \in \mathbb{R}^{n+1} | \sum x_i = 1 \}.  These aren’t literally simplexes: they’re affine planes, which we understand as smooth spaces – with the subspace diffeology from \mathbb{R}^{n+1}.  But they behave like simplexes: there are face and degeneracy maps for them, and the like.  They form a “cosimplicial object”, which we can think of as a functor \Delta \rightarrow Diff, where \Delta is the simplex category.
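The face and degeneracy maps here are given by the same formulas as for ordinary simplexes; they just never ask the coordinates to be non-negative.  The sketch below uses the usual conventions (insert a zero coordinate, or add two adjacent coordinates), which I am assuming are the ones intended, and uses exact rational arithmetic so the checks are not blurred by rounding.

```python
from fractions import Fraction


def coface(i, x):
    """d^i : A^{n-1} -> A^n, inserting a zero coordinate in position i."""
    return x[:i] + (Fraction(0),) + x[i:]


def codegeneracy(i, x):
    """s^i : A^{n+1} -> A^n, adding coordinates i and i+1 together."""
    return x[:i] + (x[i] + x[i + 1],) + x[i + 2:]


def in_affine_simplex(x):
    """Membership in A^n: the coordinates sum to 1 (no positivity required)."""
    return sum(x) == 1


# A point of A^2 with a negative coordinate: fine in the affine plane,
# though it lies outside the usual triangle.
p = (Fraction(3, 2), Fraction(-1, 2), Fraction(0))
```

Both kinds of map preserve the condition \sum x_i = 1, and one can check the cosimplicial identities on sample points, for instance d^j d^i = d^i d^{j-1} for i < j and s^j d^j = id.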

Then the point is one can look at, for a smooth space X, the smooth singular simplicial set S(X): it’s a simplicial set where the sets are sets of smooth maps from the affine simplex into X.  Likewise, for a simplicial set S, there’s a smooth space, the “geometric realization” |S|.  These give two functors |\cdot | and S, which are adjoints (| \cdot | is the left adjoint).  And then, weak equivalences and fibrations being defined in simplicial sets (w.e. are homotopy equivalences of the realization in Top, and fibrations are “Kan fibrations”), you can just pull the definition back to Diff: a smooth map is a w.e. if its image under S is one.  The cofibrations get indirectly defined via the lifting properties they need to have relative to the other two classes.

So it’s still not completely settled that this definition actually gives a model category structure, but it’s pretty close.  Certainly, some things are known.  For instance, Enxin Wu showed that if you have a fibrant object X (i.e. one where the unique map to the terminal object is a fibration – these are generally the “good” objects to define homotopy groups on), then the smooth homotopy groups agree with the simplicial ones for S(X).  This implies that for these objects, the weak equivalences are exactly the smooth maps that give isomorphisms for homotopy groups.  And so forth.  But notice that even some fairly nice objects aren’t fibrant: two lines glued together at a point isn’t, for instance.

There are various further results.  One, a consequence of a result Enxin proved, is that all manifolds are fibrant objects, where these nice properties apply.  It’s interesting that this comes from the fact that, in Diff, every (connected) manifold is a homogeneous space.  These are quotients of smooth groups, G/H – the space is a space of cosets, and H is understood to be the stabilizer of the point.  Usually one thinks of homogeneous spaces as fairly rigid things: the Euclidean plane, say, where G is the whole Euclidean group, and H the rotations; or a sphere, where G is all n-dimensional rotations, and H the ones that fix some point on the sphere.  (Actually, this gives a projective plane, since opposite points on the sphere get identified.  But you get the idea).  But that’s for Lie groups.  The point is that G = Diff(M,M), the space of diffeomorphisms from M to itself, is a perfectly good smooth group.  Then the subgroup H of diffeomorphisms that fix a chosen point is a fine smooth subgroup, and G/H is a homogeneous space in Diff.  But that’s just M, with G acting transitively on it – any point can be taken anywhere on M.

Cohesive Infinity-Toposes

One further thing I’d mention here concerns a related but more abstract approach to the question of how to incorporate homotopy-theoretic tools with a setting that supports differential geometry.  This is the notion of a cohesive topos, and more generally of a cohesive infinity-topos.  Urs Schreiber has advocated for this approach, for instance.  It doesn’t really conflict with the kind of thing Dan was talking about, but it gives a setting for it with a lot of abstract machinery.  I won’t try to explain the details (which anyway I’m not familiar with), but just enough to suggest how the two seem to me to fit together, after discussing it a bit with Dan.

The idea of a cohesive topos seems to start with Bill Lawvere, and it’s supposed to characterize something about those categories which are really “categories of spaces” the way Top is.  Intuitively, spaces consist of “points”, which are held together in lumps we could call “pieces”.  Hence “cohesion”: the points of a typical space cohere together, rather than being a dust of separate elements.  When that happens, in a discrete space, we just say that each piece happens to have just one point in it – but a priori we distinguish the two ideas.  So we might normally say that Top has an “underlying set” functor U : Top \rightarrow Set, and its left adjoint, the “discrete space” functor Disc: Set \rightarrow Top (left adjoint since set maps from S are the same as continuous maps from Disc(S) – it’s easy for maps out of Disc(S) to be continuous, since every subset is open).

In fact, any topos of sheaves on some site has a pair of functors like this (where U becomes \Gamma, the “set of global sections” functor), essentially because Set is the topos of sheaves on a single point, and there’s a terminal map from any site into the point.  So this adjoint pair is the “terminal geometric morphism” into Set.

But this omits a couple of other things that apply to Top: U has a right adjoint, Codisc: Set \rightarrow Top, where Codisc(S) has only S and \emptyset as its open sets.  In Codisc(S), all the points are “stuck together” in one piece.  On the other hand, Disc itself has a left adjoint, \Pi_0: Top \rightarrow Set, which gives the set of connected components of a space.  \Pi_0(X) is another kind of “underlying set” of a space.  So we call a topos \mathcal{E} “cohesive” when the terminal geometric morphism extends to a chain of four adjoint functors in just this way, which satisfy a few properties that characterize what’s happening here.  (We can talk about “cohesive sites”, where this happens.)

Now Diff isn’t exactly a category of sheaves on a site: it’s the category of concrete sheaves on a (concrete) site.  There is a cohesive topos of all sheaves on the diffeological site.  (What’s more, it’s known to have a model category structure).  But now, it’s a fact that any cohesive topos \mathcal{E} has a subcategory of concrete objects (ones where the canonical unit map X \rightarrow Codisc(\Gamma(X)) is mono: roughly, we can characterize the morphisms of X by what they do to its points).  This category is always a quasitopos (and it’s a reflective subcategory of \mathcal{E}: see the previous post for some comments about reflective subcategories if interested…)  This is where Diff fits in here.  Diffeologies define a “cohesion” just as topologies do: points are in the same “piece” if there’s some plot from a connected part of \mathbb{R}^n that lands on both.  Why is Diff only a quasitopos?  Because in general, the subobject classifier in \mathcal{E} isn’t concrete – but it will have a “concretification”, which is the weak subobject classifier I mentioned above.

Where the “infinity” part of “infinity-topos” comes in is the connection to homotopy theory.  Here, we replace the topos Sets with the infinity-topos of infinity-groupoids.  Then the “underlying” functor captures not just the set of points of a space X, but its whole fundamental infinity-groupoid.  Its objects are points of X, its morphisms are paths, 2-morphisms are homotopies of paths, and so on.  All the homotopy groups of X live here.  So a cohesive infinity-topos is defined much like above, but with \infty-Gpd playing the role of Set, and with that \Pi_0 functor replaced by \Pi, something which, implicitly, gives all the homotopy groups of X.  We might look for cohesive infinity-toposes to be given by the (infinity)-categories of simplicial sheaves on cohesive sites.

This raises a point Dan made in his talk: over the diffeological site D, we can talk about a cube of different structures that live over it, starting with presheaves: PSh(D).  We can add different modifiers to this: the sheaf condition; the adjective “concrete”; the adjective “simplicial”.  Various combinations of these adjectives (e.g. simplicial presheaves) are known to have a model structure.  Diff is the case where we have concrete sheaves on D.  So far, it hasn’t been proved, but it looks like it shortly will be, that this has a model structure.  This is a particularly nice one, because these things really do seem a lot like spaces: they’re just sets with some easy-to-define and well-behaved (that’s what the sheaf condition does) structure on them, and they include all the examples a differential geometer requires, the manifolds.

So I recently got back from a trip to the UK – most of the time was spent in Cardiff, at a workshop on TQFT and categorification at the University of Cardiff.  There were two days of talks, which had a fair amount of overlap with our workshop in Lisbon, so, being a little worn out on the topic, I’ll refrain from summarizing them all, except to mention a really nice one by Jeff Giansiracusa (who hadn’t been in Lisbon) which related (open/closed) TQFT’s and cohomology theories via a discussion of how categories of cobordisms with various kinds of structure correspond to various sorts of operads.  For example, the “little disks” operad, which describes the structure of how to compose disks with little holes, by pasting new disks into the holes of the old ones, corresponds to the usual cobordism category.

This workshop was part of a semester-long program they’ve been having, sponsored by an EU network on noncommutative geometry.  After the workshop was done, Tim Porter and I stayed on for the rest of the week to give some informal seminars and talk to the various grad students who were visiting at the time.  The seminars started off being directed by questions, but ended up talking about TQFT’s and their relations to various kinds of algebras and higher categorical structures, via classifying spaces.  We also had some interesting discussions outside these, for example with Jennifer Maier, who’s been working with Thomas Nikolaus on equivariant Dijkgraaf-Witten theory, and with Grace Kennedy, about planar algebras and their relationships to TQFT’s. I’d also like to give some credit to Makoto Yamashita, who’s interested in noncommutative geometry (viz) and pointed out to me a paper of Alain Connes which gives an account of integration on groupoids, and what corresponds to measures in that setting, which thankfully agrees with what little of it I’d been able to work out on my own.

However, what I’d like to take the time to write up was from the earlier part of my trip, where I visited with Jamie Vicary at Oxford. While I was there, I gave a little lunch seminar about the bicategory Span(Gpd) (actually a tricategory), and some of the physics- and TQFT-related uses for it. That turned out to be very apropos, because they also had another visitor at the same time, namely Jean Benabou, the fellow who invented bicategories, and introduced the idea of bicategories of spans as one of the first examples.  He gave a talk while I was there which was about the relationship between spans and what he calls “distributors” (which are often called “profunctors”, but since he was anyway the one who introduced them and gave them that name in the first place, and since he has since decided that “profunctors” should refer to only a special class of these entities, I’ll follow his terminology).

(Edit: Thanks to Thomas Streicher for passing on a reference to lecture notes he prepared from a lecture by Benabou on the same general topic.)

The question to answer is: what is the relation between spans of categories and distributors?

This is related to a slightly lower-grade question about the relationship between spans of sets, and relations, although the answer turns out to be more complicated.  So, remember that a span from a set A to a set B is just a diagram like this: A \leftarrow X \rightarrow B.  They can be composed together – so that given a span from A to B, and from B to C, we can take fibre products over B and get a span from A to C, consisting of pairs of elements from the X sets which map down to the same b \in B.  We can do the same thing in any category with pullbacks, not just {Sets}.
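For finite sets, the fibre product is short enough to spell out.  In this sketch (my own encoding, not anything from the talk) a span A \leftarrow X \rightarrow B is a dictionary sending each element of the apex X to its pair of feet (a, b), and the composite keeps exactly the pairs of apex elements that agree over B.

```python
def compose_spans(s, t):
    """Compose spans A <- X -> B and B <- Y -> C by fibre product over B.

    A span is a dict sending each apex element to its pair of feet.
    """
    return {
        (x, y): (a, c)
        for x, (a, b) in s.items()
        for y, (b2, c) in t.items()
        if b == b2
    }


# A <- X -> B with A = {"a"} and B = {"b1", "b2"}
s = {"x1": ("a", "b1"), "x2": ("a", "b2")}
# B <- Y -> C with C = {"c"}; two different elements of Y sit over b2
t = {"y1": ("b1", "c"), "y2": ("b2", "c"), "y3": ("b2", "c")}

st = compose_spans(s, t)
```

The composite has three apex elements, all lying over the pair (a, c): one through b1 and two through b2.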

A span A \leftarrow S \rightarrow B is a relation if the pair of arrows is “jointly monic”, which is to say that as a map S \rightarrow A \times B, it is a monomorphism – which, since we’re talking about sets, essentially means “a subset”.  That is, up to isomorphism of spans, S just picks out a bunch of pairs (a,b) \in A \times B, which are the “related” pairs in this relation.  So there is an inclusion {Rel} \hookrightarrow Span({Sets}).  What’s more, the inclusion has a left adjoint, which turns a span into a corresponding relation.  It follows from the fact that Sets has an “epi-mono factorization”: namely, the map f: S \rightarrow A \times B that comes from the span (and the definition of product) will factor through the image.  That is, it is the composite S \rightarrow Im(f) \rightarrow A \times B, where the first part is surjective, and the second part is injective.  Then the inclusion r(f) : Im(f) \hookrightarrow A \times B is a relation.  So we say the inclusion of Rel into Span(Set) is a reflection.  (This is a slightly misleading term: there’s an adjoint to the inclusion, but it’s not an adjoint equivalence.  “Reflecting” twice may not get you back where you started, or anywhere isomorphic to it.)

(Edit: Actually, this is a bit wrong.  See the comments below.  What’s true is that the hom-categories of Rel have reflective inclusions into the hom-categories of Span(Set).  Here, we think of Rel as a 2-category because it’s naturally enriched in posets.  Then these reflective inclusions of hom-categories can be used to build  a lax functor from Span(Set) to Rel – but not an actual functor.)

So a slightly more general question is: if \mathbb{V} is a monoidal category, and \mathbb{V}' \subset \mathbb{V} is a “reflective subcategory”, can we make \mathbb{V}' into a monoidal category just by defining A' \otimes ' B' (the product in \mathbb{V}') to be the reflection r(A' \otimes B') of the original product?  This is the one-object version of a question about bicategories.  Namely, say that \mathbb{S} is a bicategory, and \mathbb{S}' is a sub-bicategory such that every pair of objects gives a reflective subcategory: \mathbb{S}' (A,B) \subset \mathbb{S}(A,B) has a reflection.  Then can we “pull” the composition of morphisms in \mathbb{S} back to \mathbb{S}'?

The answer is no: this just doesn’t work in general.  For spans of sets, and relations, it works: composing spans essentially “counts paths” which relate elements A and B, whereas composing relations only keeps track of whether or not there is a path.  However, composing spans which come from relations, and then squashing them back down to relations again, agrees with the composite in Rel (the squashing just tells whether the set of paths from A to B by a sequence of relations is empty or not).  But in the case of Span(Cat) and some reflective subcategory – among other possible examples – associativity and unit axioms will break, unless the reflections r_{A,B} are specially tuned.  This isn’t to say that we can’t make \mathbb{V}' a monoidal category (or \mathbb{S}' a bicategory).  It just means that pulling back \otimes or \circ along the reflection won’t work.  But there is a theorem that says we can always promote such an inclusion into one where this works.
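The well-behaved case of sets and relations can be checked on a small example.  In this sketch (again my own encoding: a span is a dictionary from apex elements to pairs of feet, a relation is a set of pairs, and `reflect` takes the image in A \times B) the span composite counts the paths, and squashing it recovers the composite in Rel.

```python
def compose_spans(s, t):
    """Compose spans A <- X -> B and B <- Y -> C by fibre product over B."""
    return {
        (x, y): (a, c)
        for x, (a, b) in s.items()
        for y, (b2, c) in t.items()
        if b == b2
    }


def reflect(span):
    """Image of the map apex -> A x B: forget multiplicity, keep the related pairs."""
    return set(span.values())


def as_span(rel):
    """Regard a relation (a set of pairs) as a jointly monic span."""
    return {pair: pair for pair in rel}


def compose_relations(r, q):
    """Usual composite of relations: (a, c) is related if some b links them."""
    return {(a, c) for (a, b) in r for (b2, c) in q if b == b2}


r = {("a", "b1"), ("a", "b2")}
q = {("b1", "c"), ("b2", "c")}

span_composite = compose_spans(as_span(r), as_span(q))   # one apex element per path
squashed = reflect(span_composite)                       # only "is there a path?"
```

There are two paths from a to c, so the span composite is not itself a relation, but its reflection is exactly `compose_relations(r, q)`.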

So what’s an instance of all this?  A distributor (again, often called “profunctor”) \Phi : \mathbb{A} \nrightarrow \mathbb{B} from a category \mathbb{A} to \mathbb{B} is actually a functor \Phi : \mathbb{B}^{op} \times \mathbb{A} \rightarrow Sets.  Then there’s a bicategory Dist, where for each pair of objects there’s a category Dist(\mathbb{A},\mathbb{B}).  Distributors represent, in some sense, a categorification of relations. (This observation follows the periodic table of category theory, in which a 1-category is a category, a 0-category is a set, and a (-1)-category is a truth value.  There’s a 1-category of relations, with hom-sets Rel(A,B), and each one is a map from B \times A into truth values, specifying whether a pair (b,a) is related.)

The most elementary example of a distributor is the “hom-set” construction on a single category, where \Phi (B,A) = hom(B,A), which is indeed covariant in A and contravariant in B.  A way to see the general case is that \Phi obviously determines a functor from \mathbb{A} into presheaves on \mathbb{B}: \Phi : \mathbb{A} \rightarrow \hat{\mathbb{B}}, where \hat{\mathbb{B}} = Psh(\mathbb{B}) is the category hom(\mathbb{B}^{op},Sets).

In fact, given a functor F : \mathbb{A} \rightarrow \mathbb{B}, we can define two different distributors:

\Phi^F : \mathbb{B} \nrightarrow \mathbb{A} with \Phi^F(A,B) = Hom_{\mathbb{B}}(FA,B)


\Phi_F : \mathbb{A} \nrightarrow \mathbb{B} with \Phi_F(B,A) = Hom_{\mathbb{B}}(B,FA)

(Remember, these \Phi are functors from the product into Sets: so they are just taking hom-sets here in \mathbb{B} in one direction or the other.)  This much is a tautology: putting in a value from \mathbb{A} leaves a free variable, but the point is that \hat{\mathbb{B}} can be interpreted as a category of “big objects of \mathbb{B}”.  This is since the Yoneda embedding Y : \mathbb{B} \hookrightarrow \hat{\mathbb{B}} embeds \mathbb{B} by taking each object b \in \mathbb{B} to the representable presheaf hom_{\mathbb{B}}(-,b) which assigns each object the set of morphisms into b, so \hat{\mathbb{B}} has “extended” objects of \mathbb{B}.

So distributors like \Phi are “generalized functors” into \mathbb{B} – and the idea is that this is in roughly the same way that “distributions” are to be seen as “generalized functions”, hence the name.  (Benabou now prefers to use the name “profunctor” to refer only to those distributors which map to “pro-objects” in \hat{\mathbb{B}}, which are just special presheaves, namely the “flat” ones.)

Now we have an idea that there is a bicategory Dist, whose hom-categories Dist(\mathbb{A},\mathbb{B}) consist of distributors (and natural transformations), and that the usual functors (which can be seen as distributors which only happen to land in the image of \mathbb{B} under the Yoneda embedding) form a sub-bicategory: that is, post-composition with Y turns a functor into a distributor.

But moreover, this operation has an adjoint: functors out of \mathbb{B} can be “lifted” to functors out of \hat{\mathbb{B}}, just by taking the Kan extension of a functor G : \mathbb{B} \rightarrow \mathbb{X} along Y.  This will work (pointwise, even), as long as \mathbb{X} is cocomplete, so that we can basically “add up” contributions from the objects of \mathbb{B} by taking colimits.  In the special case where \mathbb{X} = \hat{\mathbb{C}} for some other category \mathbb{C}, then this tells us how to get composition of distributors Dist(\mathbb{A},\mathbb{B}) \times Dist(\mathbb{B},\mathbb{C})\rightarrow Dist(\mathbb{A},\mathbb{C}).

Now, for a functor F, there are straightforward unit and counit natural transformations which make \Phi^F (the image of F under the embedding of Cat into Dist) a left adjoint for \Phi_F.  So we’ve embedded Cat into Dist in such a way that every functor has a right adjoint.  What about Span(Cat)?  In general, given a bicategory B, we can construe Span(B) as a tricategory, which contains B, in such a way that every morphism of B has an ambidextrous adjoint (both left and right adjoint).  (There’s work on this by Toby Kenney and Dorette Pronk, and Alex Hoffnung has also been looking at this recently.)  So how does Span(Cat) relate to Dist?

One statement is that a distributor \Phi : \mathbb{A} \nrightarrow \mathbb{B} can be seen as a special kind of span, namely:

\mathbb{A} \stackrel{q}{\longleftarrow} Elt(\Phi) \stackrel{p}{\longrightarrow} \mathbb{B}

where Elt(\Phi) consists of all the “elements of \Phi” (in particular, pasting together all the images in Sets of pairs (A,B) and the set maps that come from morphisms between them in \mathbb{B}^{op} \times \mathbb{A}).  (As an aside: Benabou also explained how a cospan, \mathbb{A} \rightarrow C(\Phi) \leftarrow \mathbb{B} can be got from a distributor.  The objects of C(\Phi) are just the disjoint union of those from \mathbb{A} and \mathbb{B}, and the hom-sets are just taken from either \mathbb{A}, or \mathbb{B}, or as the sets given by \Phi, depending on the situation.  Then the span we just described completes a pullback square opposite this cospan – it’s a comma category.)

These spans (Elt(\Phi),p,q) end up having some special properties that result from how they’re constructed.  In particular, p will be an op-fibration and q will be a fibration (this, for instance, is a lifting property that lets one lift morphisms – since the morphisms are found as the images of the original distributor, this makes sense).  Also, the fibres of (p,q) are discrete (these are by definition the images of identity morphisms, so naturally they’re discrete categories).  Finally, these properties (fibration, op-fibration, and discrete fibres) are enough to guarantee that a given span is (isomorphic to) one that comes from a distributor.  So we have an embedding i : Dist \rightarrow Span(Cat).

What’s more, it’s a reflective embedding, because we can always mangle any span to get a new one where these properties hold: it’s enough to force fibres to be discrete by taking their \pi_0 – the connected components.  The other properties will then follow.  But notice that this is a very nontrivial thing to do: in general, the fibres of (p,q) could be any sort of category, and this operation turns them into sets (of connected components).  So there’s an adjunction between i and \pi_0, and Dist is a reflective sub-bicategory of Span(Cat).  But the severity of \pi_0 ends up meaning that this doesn’t get along well with composition – the composition of distributors (described above) is not related to composition of spans (which works by weak pullback) via this reflection in a naive way.  However, the theorem mentioned above means that there will be SOME reflection that makes the compositions get along.  It just may not be as nice as this one.
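To get a feel for how much \pi_0 throws away, here is a minimal sketch that computes the connected components of a small category presented by a list of objects and generating arrows.  The function name, the example fibre and its object labels are all made up for illustration; this is not taken from Benabou’s talk.

```python
from collections import defaultdict


def connected_components(objects, morphisms):
    """Return pi_0 of a small category as a set of frozensets of objects.

    `morphisms` is an iterable of (source, target) pairs.  Identities and
    composites can be omitted: they never connect anything new.
    """
    parent = {x: x for x in objects}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Any arrow, in either direction, puts its endpoints in one component.
    for source, target in morphisms:
        parent[find(source)] = find(target)

    classes = defaultdict(set)
    for x in objects:
        classes[find(x)].add(x)
    return {frozenset(c) for c in classes.values()}


# Hypothetical fibre: a single non-invertible arrow a -> b, plus a lone object c.
fibre_objects = ["a", "b", "c"]
fibre_morphisms = [("a", "b")]
components = connected_components(fibre_objects, fibre_morphisms)
```

Here a and b are not isomorphic, but they land in the same component, so the three-object fibre collapses to a two-element set.  All the information about which arrows existed, and in which direction, is gone.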

This is kind of surprising, and the ideal punchline to go here would be to say what that reflection is like, but I don’t know the answer to that question just now.  Anyone else know?

Thanks to Bob Coecke, here are some pictures of me, a few of the people from ComLab, and Jean Benabou at dinner at the Oxford University Club, with a variety of dopey expressions as Bob snapped the pictures unexpectedly.  Thanks Bob.

One talk at the workshop was nominally a school talk by Laurent Freidel, but it’s interesting and distinctive enough in its own right that I wanted to consider it by itself.  It was based on this paper on the “Principle of Relative Locality”. This isn’t so much a new theory, as an exposition of what ought to happen when one looks at a particular limit of any putative theory that has both quantum field theory and gravity as (different) limits of it. This leads through some ideas, such as curved momentum space, which have been kicking around for a while. The end result is a way of accounting for apparently non-local interactions of particles, by saying that while the particles themselves “see” the interactions as local, distant observers might not.

Einstein’s gravity describes a regime where Newton’s gravitational constant G_N is important but Planck’s constant \hbar is negligible, whereas (special-relativistic) quantum field theory assumes \hbar significant but G_N not.  Both of these assume there is a special velocity scale, given by the speed of light c, whereas classical mechanics assumes that all three can be neglected (i.e. G_N and \hbar are zero, and c is infinite).   The guiding assumption is that these are all approximations to some more fundamental theory, called “quantum gravity” just because it accepts that both G_N and \hbar (as well as c) are significant in calculating physical effects.  So GR and QFT incorporate two of the three constants each, and classical mechanics incorporates none.  The “principle of relative locality” arises when we consider a slightly different approximation to this underlying theory.

This approximation works with a regime where G_N and \hbar are each negligible, but the ratio is not – this being related to the Planck mass m_p \sim  \sqrt{\frac{\hbar}{G_N}}.  The point is that this is an approximation with no special length scale (“Planck length”), but instead a special energy scale (“Planck mass”) which has to be preserved.   Since energy and momentum are different parts of a single 4-vector, this is also a momentum scale; we expect to see some kind of deformation of momentum space, at least for momenta that are bigger than this scale.  The existence of this scale turns out to mean that momenta don’t add linearly – at least, not unless they’re very small compared to the Planck scale.
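To see why this limit keeps a mass scale but loses the length scale, it may help to write out the standard Planck units with c restored.  The scaling parameter \epsilon below is just a bookkeeping device introduced here, not notation from the talk or the paper.

```latex
m_p = \sqrt{\frac{\hbar c}{G_N}}, \qquad \ell_p = \sqrt{\frac{\hbar G_N}{c^3}} .

% Rescale \hbar \to \epsilon \hbar and G_N \to \epsilon G_N, then let \epsilon \to 0:
m_p \;\to\; \sqrt{\frac{\epsilon \hbar c}{\epsilon G_N}} = m_p ,
\qquad
\ell_p \;\to\; \sqrt{\frac{\epsilon^2 \hbar G_N}{c^3}} = \epsilon\, \ell_p \;\to\; 0 .
```

So sending \hbar and G_N to zero together, with their ratio fixed, leaves m_p untouched while the Planck length shrinks away.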

So what is “Relative Locality”?  In the paper linked above, it’s stated like so:

Physics takes place in phase space and there is no invariant global projection that gives a description of processes in spacetime.  From their measurements local observers can construct descriptions of particles moving and interacting in a spacetime, but different observers construct different spacetimes, which are observer-dependent slices of phase space.


This arises from taking the basic insight of general relativity – the requirement that physical principles should be invariant under coordinate transformations (i.e. diffeomorphisms) – and extending it so that instead of applying just to spacetime, it applies to the whole of phase space.  Phase space (which, in this limit where \hbar = 0, replaces the Hilbert space of a truly quantum theory) is the space of position-momentum configurations (of things small enough to treat as point-like, in a given fixed approximation).  Having no G_N means we don’t need to worry about any dynamical curvature of “spacetime” (which doesn’t exist), and having no Planck length means we can blithely treat phase space as a manifold with coordinates valued in the real line (which has no special scale).  Yet, having a special mass/momentum scale says we should see some purely combined “quantum gravity” effects show up.

The physical idea is that phase space is an accurate description of what we can see and measure locally.  Observers (whom we assume small enough to be considered point-like) can measure their own proper time (they “have a clock”) and can detect momenta (by letting things collide with them and measuring the energy transferred locally and its direction).  That is, we “see colors and angles” (i.e. photon energies and differences of direction).  Beyond this, one shouldn’t impose any particular theory of what momenta do: we can observe the momenta of separate objects and see what results when they interact and deduce rules from that.  As an extension of standard physics, this model is pretty conservative.  Now, conventionally, phase space would be the cotangent bundle of spacetime T^*M.  This model is based on the assumption that objects can be at any point, and wherever they are, their space of possible momenta is a vector space.  Being a bundle, with a global projection onto M (taking (x,p) to x), is exactly what this principle says doesn’t necessarily obtain.  We still assume that phase space will be some symplectic manifold.   But we don’t assume a priori that momentum coordinates give a projection whose fibres happen to be vector spaces, as in a cotangent bundle.

Now, a symplectic manifold  still looks locally like a cotangent bundle (Darboux’s theorem). So even if there is no universal “spacetime”, each observer can still locally construct a version of “spacetime”  by slicing up phase space into position and momentum coordinates.  One can, by brute force, extend the spacetime coordinates quite far, to distant points in phase space.  This is roughly analogous to how, in special relativity, each observer can put their own coordinates on spacetime and arrive at different notions of simultaneity.  In general relativity, there are issues with trying to extend this concept globally, but it can be done under some conditions, giving the idea of “space-like slices” of spacetime.  In the same way, we can construct “spacetime-like slices” of phase space.

Geometrizing Algebra

Now, if phase space is a cotangent bundle, momenta can be added (the fibres of the bundle are vector spaces).  Some more recent ideas about “quasi-Hamiltonian spaces” (initially introduced by Alekseev, Malkin and Meinrenken) conceive of momenta as “group-valued” – rather than taking values in the dual of some Lie algebra (the way, classically, momenta are dual to velocities, which live in the Lie algebra of infinitesimal translations).  For small momenta, these are hard to distinguish, so even group-valued momenta might look linear, but the premise is that we ought to discover this by experiment, not assumption.  We certainly can detect “zero momentum” and for physical reasons can say that given two things with two momenta (p,q), there’s a way of combining them into a combined momentum p \oplus q.  Think of doing this physically – transfer all momentum from one particle to another, as seen by a given observer.  Since the same momentum at the observer’s position can be either coming in or going out, this operation has a “negative” with (\ominus p) \oplus p = 0.
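To make “group-valued momenta” a little more concrete, here is a toy composition law on two-component momenta (energy, one spatial component).  It is modelled loosely on the sort of deformed addition that shows up in the DSR literature, and is an assumption for illustration only; nothing here says it is the law nature uses, or the one Laurent had in mind.  The names oplus, ominus and M_P are made up for this sketch.

```python
import math

M_P = 1.0  # deformation scale, in units where the Planck mass is 1 (toy value)


def oplus(p, q, m=M_P):
    """Toy deformed composition of momenta p = (p0, p1) and q = (q0, q1)."""
    p0, p1 = p
    q0, q1 = q
    # Energies add linearly; the spatial part of q is rescaled by p's energy.
    return (p0 + q0, p1 + math.exp(-p0 / m) * q1)


def ominus(p, m=M_P):
    """Left inverse for oplus: oplus(ominus(p), p) is (0, 0) up to rounding."""
    p0, p1 = p
    return (-p0, -math.exp(p0 / m) * p1)


p = (0.5, 0.3)
q = (0.2, -0.4)

zero = oplus(ominus(p), p)   # should vanish
pq = oplus(p, q)             # p first, then q
qp = oplus(q, p)             # q first, then p: not the same thing

# For momenta tiny compared to M_P the deformation is invisible.
eps = 1e-6
small = oplus((0.5 * eps, 0.3 * eps), (0.2 * eps, -0.4 * eps))
```

In this toy law the order of combination matters, the “negative” works as described, and for small momenta the result is indistinguishable from ordinary vector addition.  It happens to be associative (it is a genuine group law), so it only illustrates the failure of commutativity, not of associativity.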

We do have a space of momenta at any given observer’s location – the total of all momenta that can be observed there, and this space now has some algebraic structure.  But we have no reason to assume up front that \oplus is either commutative or associative (let alone that it makes momentum space at a given observer’s location into a vector space).  One can interpret this algebraic structure as giving some geometry.  The commutator for \oplus gives a metric on momentum space.  This is a bilinear form which is implicitly defined by the “norm” that assigns a kinetic energy to a particle with a given momentum. The associator given by p \oplus (q \oplus r) - (p \oplus q) \oplus r, infinitesimally near 0 where this makes sense, gives a connection.  This defines a “parallel transport” of a finite momentum p in the direction of a momentum q by saying infinitesimally what happens when adding dq to p.

Various additional physical assumptions – like the momentum-space “duals” of the equivalence principle (that the combination of momenta works the same way for all kinds of matter regardless of charge), or the strong equivalence principle (that inertial mass and rest mass energy per the relation E = mc^2 are the same) and so forth can narrow down the geometry of this metric and connection.  Typically we’ll find that it needs to be Lorentzian.  With strong enough symmetry assumptions, it must be flat, so that momentum space is a vector space after all – but even with fairly strong assumptions, as with general relativity, there’s still room for this “empty space” to have some intrinsic curvature, in the form of a momentum-space “dual cosmological constant”, which can be positive (so momentum space is closed like a sphere), zero (the vector space case we usually assume) or negative (so momentum space is hyperbolic).

This geometrization of what had been algebraic is somewhat analogous to what happened with velocities (i.e. vectors in spacetime) when the theory of special relativity came along.  Insisting that the “invariant” scale c be the same in every reference system meant that the addition of velocities ceased to be linear.  At least, it did if you assume that adding velocities has an interpretation along the lines of: “first, from rest, add velocity v to your motion; then, from that reference frame, add velocity w”.  While adding spacetime vectors still worked the same way, one had to rephrase this rule if we think of adding velocities as observed within a given reference frame – this became v \oplus w = \frac{v + w}{1 + vw} (scaling so c =1 and assuming the velocities are in the same direction).  When velocities are small relative to c, this looks roughly like linear addition.  Geometrizing the algebra of momentum space is thought of a little differently, but similar things can be said: we think operationally in terms of combining momenta by some process.  First transfer (group-valued) momentum p to a particle, then momentum q – the connection on momentum space tells us how to translate these momenta into the “reference frame” of a new observer with momentum shifted relative to the starting point.  Here again, the special momentum scale m_p (which is also a mass scale since a momentum has a corresponding kinetic energy) is a “deformation” parameter – for momenta that are small compared to this scale, things seem to work linearly as usual.
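As a quick numerical check of the velocity analogy, here is a small sketch of the standard special-relativistic addition law for collinear velocities, (v + w)/(1 + vw) in units where c = 1.  The function name and the sample values are just chosen for illustration.

```python
def add_velocities(v, w):
    """Collinear relativistic velocity addition, in units where c = 1."""
    return (v + w) / (1.0 + v * w)


# Everyday speeds: indistinguishable from plain addition.
slow = add_velocities(1e-4, 2e-4)

# Two boosts of 0.9c each still stay below the invariant scale.
fast = add_velocities(0.9, 0.9)

# Adding anything to c gives back c.
light = add_velocities(1.0, 0.5)
```

The deformation only shows up when the velocities are a sizeable fraction of c, which is the same pattern one expects for momenta compared to m_p.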

There’s some discussion in the paper which relates this to DSR (either “doubly” or “deformed” special relativity), which is another postulated limit of quantum gravity.  DSR is a variation of SR with both a special velocity and a special mass/momentum scale, meant to describe “what SR looks like near the Planck scale”.  It treats spacetime as a noncommutative space, and generalizes the Lorentz group to a Hopf algebra which is a deformation of it.  In DSR, the noncommutativity of “position space” is directly related to curvature of momentum space.  In the “relative locality” view, we accept a classical phase space, but not a classical spacetime within it.

Physical Implications

We should understand this scale as telling us where “quantum gravity effects” should start to become visible in particle interactions.  This is a fairly large scale for subatomic particles.  The Planck mass as usually given is about 21 micrograms: small for normal purposes, about the mass of a small sand grain, but very large for subatomic particles.  Converting to momentum units with c, this is about 6.5 kg m/s: on the order of the momentum of a kicked soccer ball or so.  For a subatomic particle this is a lot.
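The numbers above are easy to reproduce.  A minimal sketch, using CODATA values for the constants and the usual definition of the Planck mass as \sqrt{\hbar c / G_N} (the variable names are mine):

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
G_N = 6.67430e-11       # Newton's constant, m^3 kg^-1 s^-2
C = 299792458.0         # speed of light, m/s

planck_mass = math.sqrt(HBAR * C / G_N)   # in kg
planck_mass_micrograms = planck_mass * 1e9
planck_momentum = planck_mass * C         # in kg m/s

print(f"Planck mass     ~ {planck_mass_micrograms:.1f} micrograms")
print(f"Planck momentum ~ {planck_momentum:.2f} kg m/s")
```

This gives a mass a little under 22 micrograms and a momentum of roughly 6.5 kg m/s.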

This scale does raise a question for many people who first hear the argument that quantum gravity effects should become apparent around the Planck mass/momentum scale, since macro-objects like the aforementioned soccer ball still seem to have linearly-additive momenta.  Laurent explained the problem with this intuition.  For interactions of big, extended, but composite objects like soccer balls, one has to calculate not just one interaction, but all the various interactions of their parts, so the “effective” mass scale where the deformation would be seen becomes N m_p where N is the number of particles in the soccer ball.  Roughly, the point is that a soccer ball is not a large “thing” for these purposes, but a large conglomeration of small “things”, whose interactions are “fundamental”.  The “effective” mass scale tells us how we would have to alter the physical constants to be able to treat it as a “thing”.  (This is somewhat related to the question of “effective actions” and renormalization, though these are a bit more complicated.)
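For a rough sense of how far out of reach N m_p is, here is a back-of-envelope estimate.  It assumes a ball of about 0.43 kg and counts nucleons as the “particles”; both choices are mine, made only to get an order of magnitude.

```python
PLANCK_MASS = 2.176e-8    # kg
NUCLEON_MASS = 1.67e-27   # kg, rough proton/neutron mass
BALL_MASS = 0.43          # kg, assumed mass of a soccer ball

# Assumption: the relevant "particles" are nucleons.
n_particles = BALL_MASS / NUCLEON_MASS

# Effective scale at which the composite object would see the deformation.
effective_scale = n_particles * PLANCK_MASS   # kg

# How far the ball's own mass is from that scale.
ratio = effective_scale / BALL_MASS

print(f"N               ~ {n_particles:.1e}")
print(f"N * m_p         ~ {effective_scale:.1e} kg")
print(f"N * m_p / m_ball ~ {ratio:.1e}")
```

On these assumptions the effective scale is of order 10^18 kg, some nineteen orders of magnitude above the ball’s own mass, which is just the ratio of the Planck mass to a nucleon mass.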

There are a number of possible experiments suggested in the paper, which Laurent mentioned in the talk.  One involves a kind of “twin paradox” taking place in momentum space.  In “spacetime”, a spaceship travelling a large loop at high velocity will arrive where it started having experienced less time than an observer who remained there (because of the Lorentzian metric) – and a dual phenomenon in momentum space says that particles travelling through loops (also in momentum space) should arrive displaced in space because of the relativity of localization.  This could be observed in particle accelerators where particles make several transits of a loop, since the effect is cumulative.  Another effect could be seen in astronomical observations: if an observer is observing some distant object via photons of different wavelengths (hence momenta), she might “localize” the object differently – that is, the two photons travel at “the same speed” the whole way, but arrive at different times because the observer will interpret the object as being at two different distances for the two photons.

This last one is rather weird, and I had to ask how one would distinguish this effect from a variable speed of light (predicted by certain other ideas about quantum gravity).  How to distinguish such effects seems to be not quite worked out yet, but at least this is an indication that there are new, experimentally detectable, effects predicted by this “relative locality” principle.  As Laurent emphasized, once we’ve noticed that not accepting this principle means making an a priori assumption about the geometry of momentum space (even if only in some particular approximation, or limit, of a true theory of quantum gravity), we’re pretty much obliged to stop making that assumption and do the experiments.  Finding our assumptions were right would simply be revealing which momentum space geometry actually obtains in the approximation we’re studying.

A final note about the physical interpretation: this “relative locality” principle can be discovered by looking (in the relevant limit) at a Lagrangian for free particles, with interactions described in terms of momenta.  It so happens that one can describe this without referencing a “real” spacetime: the part of the action that allows particles to interact when “close” only needs coordinate functions, which can certainly exist here, but are an observer-dependent construct.  The conservation of (non-linear) momenta is specified via a Lagrange multiplier.  The whole Lagrangian formalism for the mechanics of colliding particles works without reference to spacetime.  Now, all the interactions (specified by the conservation of momentum terms) happen “at one location”, in that there will be an observer who sees them happening in the momentum space of her own location.  But an observer at a different point may disagree about whether the interaction was local – i.e. happened at a single point in spacetime.  Thus “relativity of localization”.
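Schematically, an action of the kind described here might be written as follows.  This is a sketch in my own notation, reconstructed from memory of the general shape used in the relative locality literature, so the symbols (k^J for the momentum of particle J, x_J for its conjugate coordinates, N_J for a mass-shell multiplier, D(k) for the distance from the origin in momentum space, K for the \oplus-combination of the momenta at the vertex, z for the Lagrange multiplier) and the sign conventions should all be taken as assumptions.

```latex
S = \sum_J S^J_{\mathrm{free}} + S_{\mathrm{int}},
\qquad
S^J_{\mathrm{free}} = \int ds \, \Big( x_J^a \, \dot{k}^J_a + N_J \big( D^2(k^J) - m_J^2 \big) \Big),
\qquad
S_{\mathrm{int}} = z^a \, K_a\big(k(0)\big) .

% Varying z imposes the (non-linear) conservation law at the vertex:
K_a\big(k(0)\big) = 0 .

% Varying the endpoint momenta ties each worldline endpoint to z, up to sign:
x_J^a(0) = \pm \, z^b \, \frac{\partial K_b}{\partial k^J_a} .
```

When K is just the linear sum of momenta, the last relation says every particle’s endpoint sits at the same z, so the interaction is local for everyone.  When \oplus is non-linear, the endpoints coincide only for the observer at z = 0, which is where the disagreement between observers comes from.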

Again, this is no more bizarre (mathematically) than the fact that distant, relatively moving, observers in special relativity might disagree about simultaneity, whether two events happened at the same time.  They have their own coordinates on spacetime, and transferring between them mixes space coordinates and time coordinates, so they’ll disagree whether the time-coordinate values of two events are the same.  Similarly, in this phase-space picture, two different observers each have a coordinate system for splitting phase space into “spacetime” and “energy-momentum” coordinates, but switching between them may mix these two pieces.  Thus, the two observers will disagree about whether the spacetime-coordinate values for the different interacting particles are the same.  And so, one observer says the interaction is “local in spacetime”, and the other says it’s not.  The point is that it’s local for the particles themselves (thinking of them as observers).  All that’s going on here is the not-very-astonishing fact that in the conventional picture, we have no problem with interactions being nonlocal in momentum space (particles with very different momenta can interact as long as they collide with each other)… combined with the inability to globally and invariantly distinguish position and momentum coordinates.

What this means, philosophically, can be debated, but it does offer some plausibility to the claim that space and time are auxiliary, conceptual additions to what we actually experience, which just account for the relations between bits of matter.  These concepts can be dispensed with even where we have a classical-looking phase space rather than Hilbert space (where, presumably, this is even more true).

Edit: On a totally unrelated note, I just noticed this post by Alex Hoffnung over at the n-Category Cafe which gives a lot of detail on issues relating to spans in bicategories that I had begun to think more about recently in relation to developing a higher-gauge-theoretic version of the construction I described for ETQFT. In particular, I’d been thinking about how the 2-group analog of restriction and induction for representations realizes the various kinds of duality properties, where we have adjunctions, biadjunctions, and so forth, in which units and counits of the various adjunctions have further duality. This observation seems to be due to Jim Dolan, as far as I can see from a brief note in HDA II. In that case, it’s really talking about the star-structure of the span (tri)category, but looking at the discussion Alex gives suggests to me that this theme shows up throughout this subject. I’ll have to take a closer look at the draft paper he linked to and see if there’s more to say…
