So at the instigation of Dan Christensen, I started organizing a seminar at UWO’s math department this year (here is its website). The title I came up with, “Seminar on Stacks, Groupoids and Algebras” is less than self-explanatory, and it remains to be seen how accurate it is, but it was the best title I could think of to express the theme I wanted to investigate. This, roughly, is that there are several ways of looking at “spaces” which carry some built-in symmetry, which have various points of commonality (at least they can be used to describe similar situations) but are based on quite different viewpoints. The seminar is supposed to introduce them, and illuminate some of the points of similarity and difference.

I gave the first talk, the notes for which are here, entitled “Topological and Lie Groupoids, with relations to C*-Algebras and Stacks”. This was mainly about the point of view which I’m most comfortable with, namely groupoids, but also tries to give a brief overview of how groupoids are related to the C*-algebras used in noncommutative geometry to represent some generalizations of spaces (especially to give a nice way to represent quotient spaces for group actions with singular points and similar bad behaviour), and to stacks.

Now, a quotient groupoid (in \mathbf{Set}) comes from a group G acting on a set X. The objects of this groupoid are just the elements of X, and the morphisms are pairs (g,x) \in G \times X, where (g,x) : x \rightarrow g(x). (Analogous statements apply to groupoids in \mathbf{Top}, or \mathbf{Diff}, or \mathbf{Aff}, the category of affine schemes which is important in algebraic geometry.) The particular example that brought me to this subject came out of looking at ETQFT, and is a fairly particular class of quotient groupoids. Namely, the one in which X is the space of flat G-bundles on a fixed (and let’s assume connected) manifold M (that is, bundles equipped with a flat connection), and G is interpreted as the group of gauge transformations. As we’ll see, this is closely related to the usual motivating examples for stacks.
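To make the quotient groupoid concrete, here is a minimal sketch in Python for a finite example. The choice of group (Z/4) and of the set it acts on is purely illustrative and not taken from the talk; the names `act`, `compose` and so on are hypothetical helpers.

```python
from itertools import product

G = range(4)   # Z/4 under addition mod 4 (an illustrative choice)
X = range(5)   # the points 0..3 are rotated, the point 4 is fixed

def act(g, x):
    """Action of g on x: rotate 0..3 by g, leave 4 alone."""
    return (x + g) % 4 if x < 4 else 4

# Morphisms of the quotient groupoid: pairs (g, x), read as x -> g(x).
morphisms = list(product(G, X))

def source(m):
    return m[1]

def target(m):
    g, x = m
    return act(g, x)

def compose(m2, m1):
    """m2 after m1; only defined when target(m1) == source(m2)."""
    (h, y), (g, x) = m2, m1
    assert y == act(g, x), "morphisms are not composable"
    return ((h + g) % 4, x)

def identity(x):
    return (0, x)

def inverse(m):
    g, x = m
    return ((-g) % 4, act(g, x))

def automorphisms(x):
    """Morphisms from x to itself, i.e. the stabilizer of x."""
    return [m for m in morphisms if source(m) == x and target(m) == x]

def orbits():
    """Isomorphism classes of objects, i.e. orbits of the action."""
    return {frozenset(target(m) for m in morphisms if source(m) == x) for x in X}
```

The isomorphism classes of objects are the orbits, and the automorphism group of an object is its stabilizer, so the groupoid remembers exactly the information that the naive quotient set X/G throws away at the fixed point.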

Most of the geometric substance of my talk dealt with Lie groupoids, which is to say groupoids which have smooth manifolds of both objects and morphisms, and in which all the structure maps are morphisms in the category \mathbf{Diff} of differentiable manifolds. In particular, I talked about Lie algebroids – an extension of the idea of Lie algebras to a more differential-geometric setting. (A Lie algebroid is a vector bundle over a base manifold M with some algebraic structure, and there is a canonical one associated to a Lie groupoid, which gives the usual Lie algebra of a Lie group G, thought of as a groupoid over the point M = \star.) Sections of Lie groupoids are used in Alan Weinstein’s definition of a volume form on a Lie groupoid (or, indeed, a differentiable stack), which generalizes the differential-geometric idea of a volume form to this Lie-theory flavoured situation. Together these give a nice illustration of the interplay between the manifold-like and the Lie-group like properties of Lie groupoids.

I mentioned that Weinstein’s volume form also makes sense for STACKS. In this context, the usual definition of a stack is that it’s an equivalence class of groupoids. The equivalence relation is Morita equivalence (more on this in the next post, I think). Now, because I was mainly interested in groupoids in \mathbf{Diff}, I mainly paid attention to topological stacks (this concept had been around informally for some time, but as far as I can determine, was formally defined, and several of the important theorems proved, fairly recently – in any case, they are collected in the paper of Behrang Noohi which I cite in the notes). This isn’t, historically, the first use of the concept, though, or the original definition… The origins and motivation of stacks were the topic of the next talk in the seminar.


The second talk was by Ajneet Dhillon (his slides are here), just entitled “Stacks”. Aji is coming from a background of algebraic geometry, which is the context in which stacks were first proposed. The idea originates with Alexandre Grothendieck, and was taken up and refined by Deligne and Mumford (in this paper), and by Artin. Aji began by explaining the original motivation, which, roughly, is to deal with problems that arise when trying to describe moduli spaces for certain algebro-geometric structures, by relaxing one’s goal from a moduli SPACE to a moduli STACK.

Now, a moduli space in this context is more than just a space which parametrizes some collection of objects – this might make sense if the objects are just sets, but if they are spaces (topological, algebraic, differentiable, or otherwise), one needs to be more careful. In particular, a moduli space for structures of some type T should be a space \mathcal{M}, with a universal family of T-structures over it: \mathcal{U} \rightarrow \mathcal{M}, so that the fibre over a point x \in \mathcal{M}, E_x, is a structure of type T.

In particular, he gave the example of the (nonexistent!) moduli space in the case of smooth projective curves of genus 1 with a marked point (essentially, this is a torus \mathbb{C}/\Lambda, where \Lambda is a full sublattice of \mathbb{C} – i.e. one which spans \mathbb{C} as a real vector space, where the marked point corresponds to the lattice points). Then the idea is that the fibres of \mathcal{U} \rightarrow \mathcal{M} are exactly these curves.

Saying this family needs to be universal means that any other family of such curves is a pullback of it. That is, given any family \mathcal{C} \rightarrow \mathcal{X} whose fibres are these projective curves, there is a map f : \mathcal{X} \rightarrow \mathcal{M} such that \mathcal{C} is the pullback of \mathcal{U} along f. The most obvious case is when \mathcal{X} is a sub-space (in this algebraic context, a subscheme) of \mathcal{M} and f is the inclusion, in which case \mathcal{C} is just the restriction of \mathcal{U}, and the fibre over any point x \in \mathcal{X} is the same as in \mathcal{U}.
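The pullback of a family is easiest to see for plain sets, where a "family" is just a function and the fibres are preimages. The following is only a sketch under that simplifying assumption (in the algebraic setting one needs fibre products of schemes instead); the dictionaries `p` and `f` are made-up data.

```python
def fibre(p, m):
    """Fibre of the family p (a dict: point of total space -> base point) over m."""
    return {u for u in p if p[u] == m}

def pullback(p, f):
    """Pull back the family p: U -> M along f: X -> M (both given as dicts).

    The total space is C = {(x, u) : f(x) == p(u)}, and the returned dict
    is the projection C -> X.
    """
    return {(x, u): x for x in f for u in p if f[x] == p[u]}

# A family over M = {'a', 'b'} with a two-element fibre over 'a'.
p = {'u1': 'a', 'u2': 'a', 'u3': 'b'}
# A map from X = {0, 1, 2} into M.
f = {0: 'a', 1: 'b', 2: 'b'}

C = pullback(p, f)
```

By construction the fibre of `C` over a point x is a copy of the fibre of `p` over f(x), which is the property the universal family is supposed to have for every family of the given type.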

So to show that such a moduli space doesn’t exist for the projective curves of genus 1, the idea is to construct a family of them which can’t be a pullback. The example is the family of curves \mathbf{C}, the collection of triples of complex numbers (x,y,z) satisfying y^2 = x(x-1)(x-z), parametrized by z – that is, with a map into the affine line taking each triple to the corresponding z, so that the fibre over z is a projective curve. Now, the fibre over z = \frac{1}{2} has an extra automorphism the other curves don’t – the corresponding lattice is just \mathbb{Z}[i], and the isomorphism is given by i \mapsto -i (which corresponds to the action of the map z \mapsto (1-z)). If there were a universal family, we would have to be able to extend this automorphism, but we can’t. So a larger category than schemes is needed to define a “moduli space”. Similar sorts of arguments apply for other sorts of spaces, such as those in \mathbf{Top} or \mathbf{Diff}, though the algebraic setting has some extra complications. Whichever category of spaces we want, call it \mathbf{S}. The main thing is that \mathbf{S} have a Grothendieck topology (essentially, a well-behaved rule to tell if a set of morphisms into X \in \mathbf{S} is a “cover”).

Then the point is, there is a larger category in which \mathbf{S} embeds, namely Fun(\mathbf{S}^{op},\mathbf{Gpd}), the category of (contravariant) weak functors from \mathbf{S} into \mathbf{Gpd} (note: some say “lax” here, but with the target \mathbf{Gpd}, these amount to the same thing). This is by an analog of the Yoneda embedding of \mathbf{S} into Fun(\mathbf{S}^{op},\mathbf{Sets}) taking S \in \mathbf{S} to the functor Hom(-,S) (in fact, since a set is a trivial groupoid, this is a special case).

So in fact a stack is just a lax functor F : \mathbf{S}^{op} \rightarrow \mathbf{Gpd} which satisfies some conditions. Some examples of lax functors that happen to be stacks are:

  • the moduli stack of elliptic curves
  • given a group G (in Aji’s context this is an algebraic group), the “classifying stack” BG of all principal G-bundles
  • the relative version of this: given G and a space X, the stack Bun_X of principal G-bundles on X

The stack conditions are a kind of categorification of the defining conditions for a sheaf, to wit (roughly):

  1. Morphisms can be “glued” – namely, for X \in \mathbf{S} and x,y \in F(X), the mapping Isom(x,y) which takes a morphism f: Y \rightarrow X to the set of all isomorphisms from f^*x to f^*y is a sheaf on \mathbf{S}/X.
  2. Objects can be “glued” – namely, “all descent data are effective”

Condition 2 means the following. Given a covering of X, say \{ f_i: U_i \rightarrow X \}, a descent datum is a collection of objects x_i \in F(U_i) \in \mathbf{Gpd}, along with isomorphisms \phi_{ij} between restrictions to the intersections U_{ij} = U_i \times_X U_j (required to be compatible with each other on triple intersections). So \phi_{ij} : x_i|_{U_{ij}} \rightarrow x_j|_{U_{ij}} (restricting an object x_j means pulling it back along the map from U_{ij}). To say that a descent datum (\{ x_i \}, \{ \phi_{ij} \} ) is effective means that all the x_i come from pulling back an object in F(X) to U_i along f_i, in a way compatible with the \phi_{ij}.

This is a weaker condition than the sheaf condition, which insists all the \phi_{ij} are equalities. (In the sheaf case, a paradigm example is the sheaf of continuous functions on a topological space – functions on open sets which agree on their overlap can be “glued” to form a function on the union. In the stack case, a paradigm example is the stack of G-bundles – which can likewise be glued, even when bundles only agree up to isomorphism on an overlap).
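The sheaf half of this comparison is simple enough to sketch in code. The following toy version, which assumes finite sets of points in place of open sets and dictionaries in place of continuous functions, glues local sections that agree on overlaps and refuses to glue ones that don't. It does not capture the stack case, where agreement on overlaps is replaced by chosen isomorphisms.

```python
def glue(sections):
    """Glue local sections (dicts: point -> value) into one section on the union.

    Raises ValueError if two sections disagree somewhere on an overlap,
    which is exactly when the sheaf condition gives nothing to glue.
    """
    glued = {}
    for section in sections:
        for point, value in section.items():
            if point in glued and glued[point] != value:
                raise ValueError(f"sections disagree at {point!r}")
            glued[point] = value
    return glued

# Two "functions" defined on overlapping sets {1, 2} and {2, 3}.
s1 = {1: 'a', 2: 'b'}
s2 = {2: 'b', 3: 'c'}
global_section = glue([s1, s2])
```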


Now, as is the way with blog entries, this one is a bit delayed, and Aji gave his second talk today. This included, along with several examples, a description of what properties distinguish an “algebraic” stack (when \mathbf{S} = \mathbf{Aff} one has to say a little more), an explanation of the correspondence between stacks and groupoids (up to equivalence), and an explanation of the definition of a stack as a category fibred in groupoids, but I’ll put off writing that up for the time being…

I say this is about a “recent” talk, though of course it was last year… But to catch up: Ivan Dynov was visiting from York and gave a series of talks, mainly to the noncommutative geometry group here at UWO, about the problem of classifying von Neumann algebras. (Strictly speaking, since there is not yet a complete set of invariants for von Neumann algebras known, one could dispute the following is a “classification”, but here it is anyway).

The first point is that any von Neumann algebra \mathcal{A} is a direct integral of factors, which are highly noncommutative in that the centre of a factor consists of just the multiples of the identity. The factors are the irreducible building blocks of the noncommutative features of \mathcal{A}.

There are two basic tools that provide what classification we have for von Neumann algebras: first, the order theory for projections; second, the Tomita-Takesaki theory. I’ve mentioned the Tomita flow previously, but as for the first part:

A projection (self-adjoint idempotent) is just what it sounds like, if you represent \mathcal{M} as an algebra of bounded operators on a Hilbert space. An extremal but informative case is \mathcal{M} = \mathcal{B}(H), but in general not every bounded operator appears in \mathcal{M}.

In the case where \mathcal{M} = \mathcal{B}(H), a projection in \mathcal{M} is the same thing as a subspace of H. There is an (orthomodular) lattice of them (in general, the lattice of projections is \mathcal{P(M)}). For subspaces, the dimension characterizes them up to isomorphism – any two subspaces of the same dimension are isomorphic by some operator in \mathcal{B}(H) (but not necessarily in a general \mathcal{M}).

The idea is to generalize this to projections in a general \mathcal{A}, and get some characterization of \mathcal{A}. The kind of isomorphism that matters for subspaces is a partial isometry – a map u which preserves the metric on some subspace, and otherwise acts as a projection. In fact, the corresponding projections are then conjugate by u. So we define, for a general \mathcal{M}, an equivalence relation on projections, which amounts to saying that e \sim f if there’s a partial isometry u \in \mathcal{M} with e = u^*u, and f = uu^* (i.e. the projections are conjugate by u).
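Here is a small sanity check of this definition in the easiest possible case, 3×3 real matrices, where the adjoint is just the transpose. The particular matrix `u` is an illustrative choice: it sends the first basis vector to the second and kills everything else.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def adjoint(a):
    """Adjoint of a real matrix, i.e. its transpose."""
    return [list(col) for col in zip(*a)]

def is_projection(p):
    """Self-adjoint idempotent: p equals its adjoint and p*p equals p."""
    return p == adjoint(p) and matmul(p, p) == p

# Partial isometry: isometric on span(e1), zero on its orthogonal complement.
u = [[0, 0, 0],
     [1, 0, 0],
     [0, 0, 0]]

e = matmul(adjoint(u), u)   # initial projection u^* u
f = matmul(u, adjoint(u))   # final projection u u^*
```

Here `e` projects onto the first coordinate axis and `f` onto the second, so the two one-dimensional subspaces are equivalent, as they should be when the algebra is all of \mathcal{B}(H).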

Then there’s an order relation on the equivalence classes of projections – which, as suggested above, we should think of as generalizing “dimension” from the case \mathcal{M} = \mathcal{B}(H). The order relation says that e \leq f if e \sim e_0 where e_0 \leq f as a projection (i.e. inclusion thinking of a projection as its image subspace of H). But the fact that \mathcal{M} may not be all of \mathcal{B}(H) has some counterintuitive consequences. For example, we can define a projection e \in \mathcal{M} to be finite if the only time e \sim e_0 \leq e is when e_0 = e (which is just the usual definition of finite, relativized to use only maps in \mathcal{M}). We can call e \in \mathcal{M} a minimal projection if it is nonzero and f \leq e implies f = e or f = 0.

Then the first pass at a classification of factors (i.e. “irreducible” von Neumann algebras) says a factor \mathcal{M} is:

  • Type I: If \mathcal{M} contains a minimal projection
  • Type II: If \mathcal{M} contains no minimal projection, but contains a (nontrivial) finite projection
  • Type III: If \mathcal{M} contains no minimal or nontrivial finite projection

We can further subdivide them by following the “dimension-function” analogy, which captures the ordering of projections for \mathcal{M} = \mathcal{B}(H), since it’s a theorem that there will be a function d : \mathcal{P(M)} \rightarrow [0,\infty] which has the properties of “dimension” in that it gets along with the equivalence relation \sim, respects finiteness, and “dimension” of direct sums. Then letting D be the range of this function, we have a few types. There may be more than one function d, but every case has one of the types:

  • Type I_n: When D = \{0,1,\dots,n\} (That is, there is a maximal, finite projection)
  • Type I_\infty: When D = \{ 0, 1, \dots, \infty \} (If there is an infinite projection in \mathcal{M})
  • Type II_1: When D = [ 0 , 1 ] (The maximal projection is finite – such a case can always be rescaled so the maximum d is 1)
  • Type II_\infty: When D = [ 0 , \infty ] (The maximal projection is infinite – notice that this has the same order type as type II_1)
  • Type III: When D = \{0,\infty\} (Every nonzero projection is infinite – these are called properly infinite)

The type I cases are all just (equivalent to) matrix algebras on some countable or finite dimensional vector space – which we can think of as a function space like \ell^2(X) for some set X. Types II and III are more interesting. Type II algebras are related to what von Neumann called “continuous geometries” – analogs of projective geometry (i.e. geometry of subspaces), with a continuous dimension function.

(If we think of these algebras \mathcal{M} as represented on a Hilbert space H, then in fact, thought of as subspaces of H, all the projections give infinite dimensional subspaces. But the definition of “finite” is relative to \mathcal{M}, and any partial isometry from a subspace H' \leq H to a proper subspace H'' < H' of itself that may exist in \mathcal{B}(H) is not in \mathcal{M}, so such a projection can still count as finite.)

In any case, this doesn’t exhaust what we know about factors. In his presentation, Ivan Dynov described some examples constructed from crossed products of algebras, which is important later, but for the moment, I’ll finish describing another invariant which helps pick apart the type III factors. This is related to Tomita-Takesaki theory, which I’ve mentioned in here before.

You’ll recall that the Tomita flow (associated to a given state \phi) is given by \sigma^{\phi}_t(A) = \Delta^{it} A \Delta^{-it}, where \Delta is the positive self-adjoint part in the polar decomposition of the conjugation operator S (which depends on the state \phi because it refers to the GNS representation of \mathcal{M} on a Hilbert space H). This flow is uninteresting for Type I or II factors, but for type III factors, it’s the basis of Connes’ classification.

In particular, we can understand the Tomita flow in terms of eigenvalues of \Delta, since it comes from imaginary powers of \Delta. Moreover, as I commented last time, the really interesting part of the flow is independent of which state we pick. So we are interested in the common eigenvalues of the \Delta associated to different states \phi, and define

S(\mathcal{M}) = \cap_{\phi \in W} Spec(\Delta_{\phi})

(where W is the set of all states on \mathcal{M}, or actually “weights”)
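For a matrix algebra all of this can be computed by hand, which is a useful sanity check even though it only covers the (uninteresting) type I case. The sketch below assumes the standard finite-dimensional picture: the state is \phi(A) = Tr(\rho A) for an invertible density matrix \rho, the flow is conjugation by \rho^{it}, and the spectrum of \Delta consists of the ratios of eigenvalues of \rho. I take \rho diagonal for simplicity, and the numbers are made up.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def modular_flow(rho, t, a):
    """rho^{it} a rho^{-it} for diagonal rho: entry (j, k) picks up (rho_j / rho_k)^{it}."""
    n = len(rho)
    return [[cmath.exp(1j * t * math.log(rho[j] / rho[k])) * a[j][k]
             for k in range(n)] for j in range(n)]

def state(rho, a):
    """The state phi(a) = Tr(rho a) for diagonal rho."""
    return sum(rho[j] * a[j][j] for j in range(len(rho)))

def modular_spectrum(rho):
    """Spectrum of Delta in this example: all ratios rho_j / rho_k."""
    return sorted({rho[j] / rho[k] for j in range(len(rho)) for k in range(len(rho))})

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to rounding error."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

rho = [0.5, 0.3, 0.2]                      # eigenvalues of a hypothetical state
a = [[1, 2, 0], [0, 1, 1j], [3, 0, 2]]
b = [[0, 1, 1], [2, 0, 0], [1j, 1, 1]]
```

The flow is an automorphism that preserves the state, and for the normalized trace (all eigenvalues of \rho equal) the spectrum collapses to \{1\}, so the intersection over all states is \{1\} in this setting.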

Then S(\mathcal{M}) - \{ 0 \}, it turns out, is always a multiplicative subgroup of the positive real line, and the possible cases refine to these:

  • S(\mathcal{M}) = \{ 1 \} : This is when \mathcal{M} is type I or II
  • S(\mathcal{M}) = [0, \infty ) : Type III_1
  • S(\mathcal{M}) = \{ 0 \} \cup \{ \lambda^n : n \in \mathbb{Z} \} : Type III_{\lambda} (for each \lambda in the range (0,1)), and
  • S(\mathcal{M}) = \{ 0 , 1 \} : Type III_0

(Taking logarithms, S(\mathcal{M}) - \{ 0 \} gives an additive subgroup of \mathbb{R}, \Gamma(\mathcal{M}) which gives the same information). So roughly, the three types are: I finite and countable matrix algebras, where the dimension function tells everything; II where the dimension function behaves surprisingly (thought of as analogous to projective geometry); and III, where dimensions become infinite but a “time flow” dimension comes into play.  The spectra of \Delta above tell us about how observables change in time by the Tomita flow:  high eigenvalues cause the observable’s value to change faster with time, while low ones change slower.  Thus the spectra describe the possible arrangements of these eigenvalues: apart from the two finite cases, the types are thus a continuous positive spectrum, and a discrete one with a single generator.  (I think of free and bound energy spectra, for an analogy – I’m not familiar enough with this stuff to be sure it’s the right one).

This role for time flow is interesting because of the procedures for constructing examples of type III, which Ivan Dynov also described to us. These are examples associated with dynamical systems. These show up as crossed products. See the link for details, but roughly this is a “product” of an algebra by a group action – a kind of von Neumann algebra equivalent of the semidirect product of groups H \rtimes K incorporating an action of K on H. Indeed, if a (locally compact) group K acts on group H then the crossed product of algebras is just the von Neumann algebra of the semidirect product group.
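Since the crossed product is modelled on the semidirect product of groups, it may help to see the smallest nontrivial example of the latter. This is a sketch with an illustrative choice of groups (not one from the talks): K = Z/2 acts on H = Z/3 by inversion, and the resulting group of order 6 is nonabelian.

```python
from itertools import product

def act(k, h):
    """Action of K = Z/2 on H = Z/3: the nontrivial element of K inverts h."""
    return h % 3 if k % 2 == 0 else (-h) % 3

def mul(a, b):
    """Multiplication in the semidirect product: (h1, k1)(h2, k2) = (h1 + k1.h2, k1 + k2)."""
    (h1, k1), (h2, k2) = a, b
    return ((h1 + act(k1, h2)) % 3, (k1 + k2) % 2)

elements = list(product(range(3), range(2)))
identity = (0, 0)

def inverse(a):
    """Find the inverse by brute force (fine for six elements)."""
    return next(b for b in elements if mul(a, b) == identity)
```

The twist by `act` in the first coordinate is the whole point: without it this would just be the direct product, and the action of K on H would be forgotten.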

In general, a (W*)-dynamical system is (\mathcal{M},G,\alpha), where G is a locally compact group acting by automorphisms on the von Neumann algebra \mathcal{M}, by the map \alpha : G \rightarrow Aut(\mathcal{M}). Then the crossed product \mathcal{M} \rtimes_{\alpha} G is the algebra for the dynamical system.

A significant part of the talks (which I won’t cover here in detail) described how to use some examples of these to construct particular type III factors. In particular, a theorem of Murray and von Neumann says \mathcal{M} = L^{\infty}(X,\mu) \rtimes_{\alpha} G is a factor if the action of discrete group G on a finite measure space X is ergodic (i.e. has no nontrivial proper invariant sets – roughly, each orbit is dense). Another says this factor is type III unless there’s a measure equivalent to (i.e. mutually absolutely continuous with) \mu which is invariant under the action. Some clever examples I won’t reconstruct gave some factors like this explicitly.

He concluded by talking about some efforts to improve the classification: the above is not a complete set of invariants, so a lot of work in this area is improving the completeness of the set. One set of results he told us about does this somewhat for the case of hyperfinite factors (i.e. ones which are limits of finite ones), namely that if they are type III, they are crossed products with a discrete group.

At any rate, these constructions are interesting, but it would take more time than I have here to look in detail – perhaps another time.

Last week there was an interesting series of talks by Ivan Dynov about the classification of von Neumann algebras, and I’d like to comment on that, but first, since it’s been a while since I posted, I’ll catch up on some end-of-term backlog and post about some points I brought up a couple of weeks ago in a talk I gave in the Geometry seminar at Western. This was about getting Extended TQFT’s from groups, which I’ve posted about plenty previously. Mostly I talked about the construction that arises from “2-linearization” of spans of groupoids (see e.g. the sequence of posts starting here).

The first intuition comes from linearizing spans of (say finite) sets. Given a map of sets f : A \rightarrow B, you get a pair of maps f^* : \mathbb{C}^B \rightarrow \mathbb{C}^A and f_* : \mathbb{C}^A \rightarrow \mathbb{C}^B between the vector spaces on A and B. (Moving from the set to the vector space stands in for moving to quantum mechanics, where a state is a linear combination of the “pure” ones – elements of the set.) The first map is just “precompose with f”, and the other involves summing over the preimage (it takes the basis vector a \in A to the basis vector f(a) \in B). These two maps are (linear) adjoints, if you use the canonical inner products where A and B are orthonormal bases. So then a span X \stackrel{s}{\leftarrow} S \stackrel{t}{\rightarrow} Y gives rise to a linear map t_* \circ s^* : \mathbb{C}^X \rightarrow \mathbb{C}^Y (and an adjoint linear map going the other way).
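For finite sets this can be written out directly with 0/1 matrices. The following is a minimal sketch; the function names and the sample span are my own illustrative choices. Maps of sets are dictionaries, and matrices are lists of rows.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def pullback_matrix(f, A, B):
    """Matrix of f^*: C^B -> C^A (rows indexed by A, columns by B): (f^* v)(a) = v(f(a))."""
    return [[1 if f[a] == b else 0 for b in B] for a in A]

def pushforward_matrix(f, A, B):
    """Matrix of f_*: C^A -> C^B (rows indexed by B, columns by A): a goes to f(a)."""
    return [[1 if f[a] == b else 0 for a in A] for b in B]

def span_matrix(s, t, S, X, Y):
    """Matrix of t_* s^*: C^X -> C^Y; entry (y, x) counts the elements of S over (x, y)."""
    return matmul(pushforward_matrix(t, S, Y), pullback_matrix(s, S, X))

# A sample span X <- S -> Y.
X, Y, S = ['x0', 'x1'], ['y0', 'y1'], [0, 1, 2]
s = {0: 'x0', 1: 'x0', 2: 'x1'}
t = {0: 'y0', 1: 'y1', 2: 'y1'}

M = span_matrix(s, t, S, X, Y)
```

With orthonormal bases the adjoint is the transpose, so adjointness of f^* and f_* is just the statement that the two matrices above are transposes of each other, and reversing the span transposes `M`.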

There’s more motivation for passing to 2-Hilbert spaces when your “pure states” live in an interesting stack (which can be thought of, up to equivalence, as a groupoid hence a category) rather than an ordinary space, but it isn’t hard to do. Replacing \mathbb{C} with the category \mathbf{FinHilb}_\mathbb{C}, and the sum with the direct sum of (finite dimensional) Hilbert spaces gives an analogous story for (finite dimensional) 2-Hilbert spaces, and 2-linear maps.

I was hoping to get further into the issues that are involved in making the 2-linearization process work with Lie groups, rather than finite groups. Among other things, this generalization ends up requiring us to work with infinite dimensional 2-Hilbert spaces (in particular, replacing \mathbf{FinHilb} with \mathbf{Hilb}). Other issues are basically measure-theoretic, since in various parts of the construction one uses direct sums. For Lie groups, these need to be direct integrals. There are also places where counting measure is used in the case of a discrete group G. So part of the point is to describe how to replace these with integrals. The analysis involved with 2-Hilbert spaces isn’t so different from that required for (1-)Hilbert spaces.

Category theory and measure theory (analysis in general, really), have not historically got along well, though there are exceptions. When I was giving a similar talk at Dalhousie, I was referred to some papers by Mike Wendt, “The Category of Disintegration”, and “Measurable Hilbert Sheaves”, which are based on category-theoretically dealing with ideas of von Neumann and Dixmier (a similar remark applies to Yetter’s paper “Measurable Categories”), so I’ve been reading these recently. What, in the measurable category, is described in terms of measurable bundles of Hilbert spaces, can be turned into a description in terms of Hilbert sheaves when the category knows about measures. But categories of measure spaces are generally not as nice, categorically, as the category of sets which gives the structure in the discrete case. Just for example, the product measure space X \times Y isn’t a categorical product – just a monoidal one, in a category Wendt calls \mathbf{Disint}.

This category has (finite) measure spaces as objects, and as morphisms has disintegrations. A disintegration from (X,\mathcal{A},\mu) to (Y,\mathcal{B},\nu) consists of:

  • a measurable function f : X \rightarrow Y
  • for each y \in Y, the preimage f^{-1}(y) = X_y becomes a measure space (with the obvious subspace sigma-algebra \mathcal{A}_y), with measure \mu_y

such that \mu can be recovered by integrating against \nu: that is, for any measurable A \subset X (that is, A \in \mathcal{A}), we have

\int_Y \int_{A_y} d\mu_y(x) d\nu(y) = \int_A d\mu(x) = \mu (A)

where A_y = A \cap X_y.
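For finite measure spaces the integrals are sums, and the defining identity can be checked by brute force. In this sketch I make one particular (assumed) choice of disintegration: \nu is the pushforward of \mu along f, and each \mu_y is the conditional measure on the fibre. The definition allows other choices as long as the identity holds. The weights are made up, and exact fractions avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations

# A finite measure space (X, mu) and a function f: X -> Y.
mu = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 8),
      3: Fraction(1, 8), 4: Fraction(1, 4)}
f = {0: 'a', 1: 'a', 2: 'b', 3: 'b', 4: 'b'}

# nu: the pushforward of mu along f (one possible choice of measure on Y).
nu = {}
for x, weight in mu.items():
    nu[f[x]] = nu.get(f[x], 0) + weight

# mu_y: the conditional measure on each fibre X_y = f^{-1}(y).
mu_y = {y: {x: mu[x] / nu[y] for x in mu if f[x] == y} for y in nu}

def integrate_fibrewise(A):
    """Left-hand side: integrate mu_y(A_y) against nu, where A_y is A intersected with X_y."""
    return sum(nu[y] * sum(w for x, w in mu_y[y].items() if x in A) for y in nu)

def measure(A):
    """Right-hand side: mu(A)."""
    return sum(mu[x] for x in A)

# Every subset of X is measurable here, so the identity can be tested on all of them.
all_subsets = [set(c) for r in range(len(mu) + 1) for c in combinations(mu, r)]
```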

So the point is that such a morphism gives, not only a measurable function f : X \rightarrow Y, but a way of “disintegrating” X relative to Y. In particular, there is a forgetful functor U : \mathbf{Disint} \rightarrow \mathbf{Msble}, where \mathbf{Msble} is the category of measurable spaces, taking the disintegration (f, \{ (X_y,\mathcal{A}_y,\mu_y) \}_{y \in Y} ) to f.

Now, \mathbf{Msble} is Cartesian; in particular, the product of measurable spaces, X \times Y, is a categorical product. Not true for the product measure space in \mathbf{Disint}, which is just a monoidal category (see footnote 1). Now, in principle, I would like to describe what to do with groupoids in (i.e. internal to), \mathbf{Disint}, but that would involve side treks into things like volumes of measured groupoids, and for now I’ll just look at plain spaces.

The point is that we want to reproduce the operations of “direct image” and “inverse image” for fields of Hilbert spaces. The first thing is to understand what’s meant by a “measurable field of Hilbert spaces” (MFHS) on a measurable space X. The basic idea was already introduced by von Neumann not long after formalizing Hilbert spaces. An MFHS on (X,\mathcal{A}) consists of:

  • a family \mathcal{H}_x of (separable) Hilbert spaces, for x \in X
  • a space \mathcal{M} \subset \bigoplus_{x \in X}\mathcal{H}_x (of “measurable sections” \phi) (i.e. pointwise inverses to projection maps \pi_x : \mathcal{M} \rightarrow \mathcal{H}_x) with three properties:
  1. measurability: the function x \mapsto ||\phi_x|| is measurable for all \phi \in \mathcal{M}
  2. completeness: if \psi \in \bigoplus_{x \in X} \mathcal{H}_x makes the function x \mapsto \langle \phi_x , \psi_x \rangle measurable for every \phi \in \mathcal{M}, then \psi \in \mathcal{M}
  3. separability: there is a countable set of sections \{ \phi^{(n)} \}_{n \in \mathbb{N}} \subset \mathcal{M} such that for all x, the \phi^{(n)}_x are dense in \mathcal{H}_x

This is a categorified analog of a measurable function: a measurable way to assign Hilbert spaces to points. Yetter describes a 2-category \mathbf{Meas(X)} of MFHS’s on X, which is an (infinite dimensional) 2-vector space – i.e. an abelian category, enriched in vector spaces. \mathbf{Meas(X)} is analogous to the space of measurable complex-valued functions on X. It is also similar to a measurable-space-indexed version of \mathbf{Vect^k}, the prototypical 2-vector space – except that here we have \mathbf{Hilb^X}. Yetter describes how to get 2-linear maps (linear functors) between such 2-vector spaces \mathbf{Meas(X)} and \mathbf{Meas(Y)}.

This describes a 2-vector space – that is, a \mathbf{Vect}-enriched abelian category – whose objects are MFHS’s, and whose morphisms are the obvious ones (that is, fields of bounded operators, whose norms give a measurable function). One thing Wendt does is to show that a MFHS \mathcal{H} on X gives rise to a measurable Hilbert sheaf – that is, a sheaf of Hilbert spaces on the site whose “open sets” are the measurable sets in \mathcal{A}, and where inclusions and “open covers” are oblivious to any sets of measure zero. (This induces a sheaf of Hilbert spaces H on the open sets, if X is a topological space and \mathcal{A} is the usual Borel \sigma-algebra). If this terminology doesn’t spell it out for you, the point is that for any measurable set A, there is a Hilbert space:

H(A) = \int^{\oplus}_A \mathcal{H}_x d\mu(x)

The descent (gluing) condition that makes this assignment a sheaf follows easily from the way the direct integral works, so that H(A) is the space of sections of \coprod_{x \in A} \mathcal{H}_x with finite norm, where the inner product of two sections \phi and \psi is the integral of \langle \phi_x, \psi_x \rangle over A.

The category of all such sheaves on X is called \mathbf{Hilb^X}, and it is equivalent to the category of MFHS up to equivalence a.e. Then the point is that a disintegration (f, \mu_y) : (X,\mathcal{A},\mu) \rightarrow (Y,\mathcal{B},\nu) gives rise to two operations between the categories of sheaves (though it’s convenient here to describe them in terms of MFHS: the sheaves are recovered by integrating as above):

f^* : \mathbf{Hilb^Y} \rightarrow \mathbf{Hilb^X}

which comes from pulling back along f – easiest to see for the MFHS, so that f^*\mathcal{H}_x = \mathcal{H}_{f(x)}, and

\int_f^{\oplus} : \mathbf{Hilb^X} \rightarrow \mathbf{Hilb^Y}

the “direct image” operation, where in terms of MFHS, we have (\int_f^{\oplus}\mathcal{H})_y = \int_{f^{-1}(y)}^{\oplus}\mathcal{H}_x d\mu_y(x). That is, one direct-integrates over the preimage.
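In the discrete case, with counting measure on every fibre, the direct integral is just a direct sum, and one can at least track dimensions. The sketch below assumes exactly that simplification (finite sets, finite-dimensional fibres, and a field recorded only by its dimensions); the names and numbers are illustrative.

```python
def pull_back(f, dims_Y):
    """Dimensions of f^*H: the fibre over x is a copy of the fibre over f(x)."""
    return {x: dims_Y[f[x]] for x in f}

def direct_image(f, dims_X, Y):
    """Dimensions of the direct image: sum the fibre dimensions over each preimage."""
    return {y: sum(d for x, d in dims_X.items() if f[x] == y) for y in Y}

# A function f: X -> Y, with one point of Y outside the image.
f = {1: 'a', 2: 'a', 3: 'b'}
Y = ['a', 'b', 'c']

dims_X = {1: 2, 2: 3, 3: 1}          # a field of Hilbert spaces on X
dims_Y = {'a': 2, 'b': 1, 'c': 7}    # a field of Hilbert spaces on Y

pushed = direct_image(f, dims_X, Y)
pulled = pull_back(f, dims_Y)
```

Pulling back and then pushing forward multiplies each fibre dimension by the size of the preimage, and a point with empty preimage gets the zero space. In the measure-theoretic setting the fibre measures \mu_y play the role of these counts.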

Now, these are measure-theoretic equivalents of two of the Grothendieck operations on sheaves (here is the text of Lipman’s Springer Lecture Notes book which includes an intro to them in Ch3 – a bit long for a first look, but the best I could find online). These are often discussed in the context of derived categories. The operation \int_f^{\oplus} is the analog of what is usually called f_*.

Part of what makes this different from the usual setting is that \mathbf{Disint} is not as nice as \mathbf{Top}, the more usual underlying category. What’s more, typically one talks about sheaves of sets, or abelian groups, or rings (which give the case of operations on schemes – i.e. topological spaces equipped with well-behaved sheaves of rings) – all of which are nicer categories than the category of Hilbert spaces. In particular, while in the usual picture f_* is left adjoint to f^*, this condition fails here because of the requirement that morphisms in \mathbf{Hilb} are bounded linear maps – instead, there’s a unique extension property.

Similarly, while f^* is always defined by pulling back along a function f, in the usual setting, the direct image functor f_* is left-adjoint to f^*, found by taking a left Kan extension along f. This involves taking a colimit (specifically, imagine replacing the direct integral with a coproduct indexed over the same set). However, in this setting, the direct integral is not a coproduct (as the direct sum would be for vector spaces, or even finite-dimensional Hilbert spaces).

So in other words, something like the Grothendieck operations can be done with 2-Hilbert spaces, but the categorical properties (adjunction, Kan extension) are not as nice.

Finally, I’ll again remark that my motivation is to apply this to groupoids (or stacks), rather than just spaces X, and thus build Extended TQFT’s from (compact) Lie groups – but that’s another story, as we said when I was young.


1 Products: The fact that we want to look at spans in categories that aren’t Cartesian is the reason it’s more general to think about spans, rather than (as you can in some settings such as algebraic geometry) in terms of “bundles over the product”, which is otherwise equivalent. For sets or set-groupoids, this isn’t an issue.

When I made my previous two posts about ideas of “state”, one thing I was aiming at was to say something about the relationships between states and dynamics. The point here is that, although the idea of “state” is that it is intrinsically something like a snapshot capturing how things are at one instant in “time” (whatever that is), extrinsically, there’s more to the story. The “kinematics” of a physical theory consists of its collection of possible states. The “dynamics” consists of the regularities in how states change with time. Part of the point here is that these aren’t totally separate.

Just for one thing, in classical mechanics, the “state” includes time-derivatives of the quantities you know, and the dynamical laws tell you something about the second derivatives. This is true in both the Hamiltonian and Lagrangian formalism of dynamics. The Hamiltonian, which represents the concept of “energy” in the context of a system, is a function H(q,p), where q is a vector representing the values of some collection of variables describing the system (generalized position variables, in some configuration space X), and the p = m \dot{q} are corresponding “momentum” variables, which are the other coordinates in a phase space which in simple cases is just the cotangent bundle T*X. Here, m refers to mass, or some equivalent. The familiar case of a moving point particle has “energy = kinetic + potential”, or H = p^2 / (2m) + V(q) for some potential function V. The symplectic form on T*X, together with H, can then be used to define a path through any point, which describes the evolution of the system in time – notably, it conserves the energy H. Then there’s the Lagrangian, which defines the “action” associated to a path, which comes from integrating some function L(q, \dot{q}) living on the tangent bundle TX, over the path. The physically realized paths (classically) are critical points of the action, with respect to variations of the path.
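To make the point-particle case concrete, here is a minimal numerical sketch of the flow generated by the standard kinetic-plus-potential Hamiltonian H = p^2/(2m) + V(q). It assumes a harmonic potential V(q) = k q^2 / 2 and a leapfrog integrator; the names m, k, hamiltonian and leapfrog are illustrative choices, not anything from the sources discussed here.

```python
def hamiltonian(q, p, m=1.0, k=1.0):
    """Energy H = p^2/(2m) + V(q), with the assumed potential V(q) = k q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate Hamilton's equations dq/dt = p/m, dp/dt = -V'(q) = -k q."""
    for _ in range(steps):
        p -= 0.5 * dt * k * q   # half kick from the force -V'(q)
        q += dt * p / m         # full drift
        p -= 0.5 * dt * k * q   # second half kick
    return q, p


q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=0.01, steps=1000)

# The energy should stay (nearly) constant along the computed path.
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

The state here is the pair (q, p), and the dynamics is entirely determined by H: changing V changes the force term and nothing else.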

This is all based on the view of a “state” as an element of a set (which happens to be a symplectic manifold like T*X or just a manifold if it’s TX), and both the “energy” and the “action” are some kind of function on this set. A little extra structure (symplectic form, or measure on path space) turns these functions into a notion of dynamics. Now a function on the space of states is what an observable is: energy certainly is easy to envision this way, and action (though harder to define intuitively) counts as well.

But another view of states which I mentioned in that first post is the one that pertains to statistical mechanics, in which a state is actually a statistical distribution on the set of “pure” states. This is rather like a function – it’s slightly more general, since a distribution can have point-masses, but any function gives a distribution if there’s a fixed measure d\mu around to integrate against – then a function like H becomes the measure H d\mu. And this is where the notion of a Gibbs state comes from, though it’s slightly trickier. The idea is that the Gibbs state (in some circumstances called the Boltzmann distribution) is the state a system will end up in if it’s allowed to “thermalize” – it’s the maximum-entropy distribution for a given amount of energy in the specified system, at a given temperature T. So, for instance, for a gas in a box, this describes how, at a given temperature, the kinetic energies of the particles are (probably) distributed. Up to a bunch of constants of proportionality, one expects that the weight given to a state (or region in state space) is just exp(-H/T), where H is the Hamiltonian (energy) for that state. That is, the likelihood of being in a state is inversely proportional to the exponential of its energy – and higher temperature makes higher energy states more likely.

Now part of the point here is that, if you know the Gibbs state at temperature T, you can work out the Hamiltonian just by taking a logarithm – so specifying a Hamiltonian and specifying the corresponding Gibbs state are completely equivalent. But specifying a Hamiltonian (given some other structure) completely determines the dynamics of the system.
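A small sketch of this back-and-forth, for a system with finitely many pure states. The energy levels and temperature are made-up numbers, and the function names are just for illustration. Note that the logarithm recovers H only up to an additive constant (the normalization), which doesn't affect the dynamics.

```python
import math


def gibbs_state(energies, T):
    """Normalized Gibbs weights exp(-H/T) / Z over a finite set of pure states."""
    weights = [math.exp(-h / T) for h in energies]
    Z = sum(weights)  # partition function (the normalizing constant)
    return [w / Z for w in weights]


def hamiltonian_from_gibbs(probs, T):
    """Recover H up to an additive constant via H = -T log p."""
    return [-T * math.log(p) for p in probs]


energies = [0.0, 1.0, 2.5]   # hypothetical energy levels
T = 0.7                      # hypothetical temperature

probs = gibbs_state(energies, T)
recovered = hamiltonian_from_gibbs(probs, T)

# The recovered values differ from the originals by one overall shift (T log Z).
shift = recovered[0] - energies[0]
```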

This is the classical version of the idea Carlo Rovelli calls “Thermal Time”, which I first encountered in his book “Quantum Gravity”, but also is summarized in Rovelli’s FQXi essay “Forget Time”, and described in more detail in this paper by Rovelli and Alain Connes. Mathematically, this involves the Tomita flow on von Neumann algebras (which Connes used to great effect in his work on the classification of same). It was reading “Forget Time” which originally got me thinking about making the series of posts about different notions of state.

Physically, remember, these are von Neumann algebras of operators on a quantum system, the self-adjoint ones being observables; states are linear functionals on such algebras. The equivalent of a Gibbs state – a thermal equilibrium state – is called a KMS (Kubo-Martin-Schwinger) state (for a particular Hamiltonian). It’s important that the KMS state depends on the Hamiltonian, which is to say the dynamics and the notion of time with respect to which the system will evolve. Given a notion of time flow, there is a notion of KMS state.

One interesting place where KMS states come up is in (general) relativistic thermodynamics. In particular, the effect called the Unruh Effect is an example (here I’m referencing Robert Wald’s book, “Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics”). Physically, the Unruh effect says the following. Suppose you’re in flat spacetime (described by Minkowski space), and an inertial (unaccelerated) observer sees it in a vacuum. Then an accelerated observer will see space as full of a bath of particles at some temperature related to the acceleration. Mathematically, a change of coordinates (acceleration) implies there’s a one-parameter family of automorphisms of the von Neumann algebra which describes the quantum field for particles. There’s also a (trivial) family for the unaccelerated observer, since the coordinate system is not changing. The Unruh effect in this language is the fact that a vacuum state relative to the time-flow for an unaccelerated observer is a KMS state relative to the time-flow for the accelerated observer (at some temperature related to the acceleration).

The KMS state for a von Neumann algebra with a given Hamiltonian operator has a density matrix \omega, which is again, up to some constant factors, just the exponential of (minus) the Hamiltonian operator, \omega \propto e^{-H/T}. (For pure states, \omega = |\Psi \rangle \langle \Psi |, and in general a matrix becomes a state by \omega(A) = Tr(A \omega) which for pure states is just the usual expectation value for A, \langle \Psi | A | \Psi \rangle).
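In finite dimensions all of this can be checked directly. The sketch below (using NumPy, with arbitrary made-up numbers) builds a pure-state density matrix and compares Tr(A \omega) with \langle \Psi | A | \Psi \rangle, then builds a thermal density matrix for an assumed diagonal two-level Hamiltonian.

```python
import numpy as np


def expectation(A, omega):
    """omega(A) = Tr(A omega) for a density matrix omega."""
    return np.trace(A @ omega)


# A pure state |Psi><Psi| built from a normalized vector (values are arbitrary).
psi = np.array([1.0, 1.0j]) / np.sqrt(2.0)
omega_pure = np.outer(psi, psi.conj())

A = np.array([[1.0, 2.0], [2.0, -1.0]])  # a self-adjoint observable

lhs = expectation(A, omega_pure)   # Tr(A omega)
rhs = psi.conj() @ A @ psi         # <Psi| A |Psi>

# A thermal density matrix exp(-H/T) / Z for an assumed diagonal Hamiltonian.
H_levels = np.array([0.0, 1.0])
T = 0.5
boltzmann = np.exp(-H_levels / T)
omega_thermal = np.diag(boltzmann / boltzmann.sum())
```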

Now, things are a bit more complicated in the von Neumann algebra picture than the classical picture, but Tomita-Takesaki theory tells us that as in the classical world, the correspondence between dynamics and KMS states goes both ways: there is a flow – the Tomita flow – associated to any given state, with respect to which the state is a KMS state. By “flow” here, I mean a one-parameter family of automorphisms of the von Neumann algebra. In the Heisenberg formalism for quantum mechanics, this is just what time is (i.e. states remain the same, but the algebra of observables is deformed with time). The way you find it is as follows (and why this is right involves some operator algebra I find a bit mysterious):

First, get the algebra \mathcal{A} acting on a Hilbert space H, with a cyclic vector \Psi (i.e. such that \mathcal{A} \Psi is dense in H – one way to get this is by the GNS representation, so that the state \omega just acts on an operator A by the expectation value at \Psi, as above, so that the vector \Psi is standing in, in the Hilbert space picture, for the state \omega). Then one can define an operator S by the fact that, for any A \in \mathcal{A}, one has

(SA)\Psi = A^{\star}\Psi

That is, S acts like the conjugation operation on operators at \Psi, which is enough to define S since \Psi is cyclic. This S has a polar decomposition (analogous for operators to the polar form for complex numbers) of S = J \Delta^{1/2}, where J is antiunitary (this is conjugation, after all) and \Delta is positive and self-adjoint. We need the self-adjoint part, because the Tomita flow is a one-parameter family of automorphisms given by:

\alpha_t(A) = \Delta^{-it} A \Delta^{it}

An important fact for Connes’ classification of von Neumann algebras is that the Tomita flow is basically unique – that is, it’s unique up to an inner automorphism (i.e. a conjugation by some unitary operator – so in particular, if we’re talking about a relativistic physical theory, a change of coordinates giving a different t parameter would be an example). So while there are different flows, they’re all “essentially” the same. There’s a unique notion of time flow if we reduce the algebra \mathcal{A} to its cosets modulo inner automorphism. Now, in some cases, the Tomita flow consists entirely of inner automorphisms, and this reduction makes it disappear entirely (this happens in the finite-dimensional case, for instance). But in the general case this doesn’t happen, and the Connes-Rovelli paper summarizes this by saying that von Neumann algebras are “intrinsically dynamic objects”. So this is one interesting thing about the quantum view of states: there is a somewhat canonical notion of dynamics present just by virtue of the way states are described. In the classical world, this isn’t the case.
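For the finite-dimensional case mentioned above, a quick numerical sanity check is possible. The sketch below assumes a randomly chosen Hermitian H on a 3-dimensional space, the Gibbs density matrix \rho = e^{-\beta H}/Z, and the Heisenberg flow \alpha_t(A) = e^{iHt} A e^{-iHt}, which (up to conventions for the sign and scale of t) is what the Tomita flow of this state reduces to. It checks that the state is invariant under the flow and satisfies the KMS condition \omega(A \alpha_{i\beta}(B)) = \omega(BA). All names and numbers are illustrative.

```python
import numpy as np


def expm_hermitian(H, z):
    """exp(z H) for Hermitian H and a complex scalar z, via eigendecomposition."""
    evals, U = np.linalg.eigh(H)
    return U @ np.diag(np.exp(z * evals)) @ U.conj().T


def flow(A, H, t):
    """alpha_t(A) = exp(iHt) A exp(-iHt); t may be complex."""
    return expm_hermitian(H, 1j * t) @ A @ expm_hermitian(H, -1j * t)


def random_complex(n, rng):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


rng = np.random.default_rng(0)
n, beta = 3, 1.3

M = random_complex(n, rng)
H = (M + M.conj().T) / 2                 # a random Hermitian "Hamiltonian"
rho = expm_hermitian(H, -beta)
rho = rho / np.trace(rho).real           # Gibbs density matrix


def omega(X):
    """The state omega(X) = Tr(rho X)."""
    return np.trace(rho @ X)


A = random_complex(n, rng)
B = random_complex(n, rng)

# The state does not change under its own flow.
invariance_error = abs(omega(flow(A, H, 0.8)) - omega(A))

# KMS condition at inverse temperature beta.
kms_error = abs(omega(A @ flow(B, H, 1j * beta)) - omega(B @ A))
```

Since the flow here is conjugation by the unitary e^{iHt}, which lies in the matrix algebra itself, it is inner, which is the sense in which the finite-dimensional flow "disappears" modulo inner automorphisms.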

Now, Rovelli’s “Thermal Time” hypothesis is, basically, that the notion of time is a state-dependent one: instead of an independent variable, with respect to which other variables change, quantum mechanics (per Rovelli) makes predictions about correlations between different observed variables. More precisely, the hypothesis is that, given that we observe the world in some state, the right notion of time should just be the Tomita flow for that state. They claim that checking this for certain cosmological models, like the Friedman model, they get the usual notion of time flow. I have to admit, I have trouble grokking this idea as fundamental physics, because it seems like it’s implying that the universe (or any system in it we look at) is always, a priori, in thermal equilibrium, which seems wrong to me since it evidently isn’t. The Friedman model does assume an expanding universe in thermal equilibrium, but clearly we’re not in exactly that world. On the other hand, the Tomita flow is definitely there in the von Neumann algebra view of quantum mechanics and states, so possibly I’m misinterpreting the nature of the claim. Also, as applied to quantum gravity, a “state” perhaps should be read as a state for the whole spacetime geometry of the universe – which is presumably static – and then the apparent “time change” would then be a result of the Tomita flow on operators describing actual physical observables. But on this view, I’m not sure how to understand “thermal equilibrium”.  So in the end, I don’t really know how to take the “Thermal Time Hypothesis” as physics.

In any case, the idea that the right notion of time should be state-dependent does make some intuitive sense. The only physically, empirically accessible referent for time is “what a clock measures”: in other words, there is some chosen system which we refer to whenever we say we’re “measuring time”. Different choices of system (that is, different clocks) will give different readings even if they happen to be moving together in an inertial frame – atomic clocks sitting side by side will still gradually drift out of sync. Even if “the system” means the whole universe, or just the gravitational field, clearly the notion of time even in General Relativity depends on the state of this system. If there is a non-state-dependent “god’s-eye view” of which variable is time, we don’t have empirical access to it. So while I can’t really assess this idea confidently, it does seem to be getting at something important.

Last Friday, UWO hosted a Distinguished Colloquium talk by Gregory Chaitin, who was talking about a proposal for a new field he calls “metabiology”, which he defined in the talk (and on the website above) as “a field parallel to biology, dealing with the random evolution of artificial software (computer programs) rather than natural software (DNA), and simple enough that it is possible to prove rigorous theorems or formulate heuristic arguments at the same high level of precision that is common in theoretical physics.” This field doesn’t really exist to date, but his talk was intended to argue that it should, and to suggest some ideas as to what it might look like. It was a well-attended talk with an interdisciplinary audience including (at least) people from the departments of mathematics, computer science, and biology. As you might expect for such a talk, it was also fairly nontechnical.

A lot of the ideas presented in the talk overlapped with those in this outline, but to summarize… One of the motivating ideas that he put forth was that there is currently no rigorous proof that Darwin-style biological evolution can work – i.e. that operations of mutation and natural selection can produce systems of very high complexity. This is a fundamental notion in biology, summarized by the slogan, “Nothing in biology makes sense except in light of evolution”. This phrase, funnily, was coined as the title of a defense of a “theistic evolution” – not obviously a majority position among scientists, but also not to be confused with “intelligent design” which claims that evolution can’t account for observed features of organisms. This is a touchy political issue in some countries, and it’s not obvious that a formal proof that mutation and selection CAN produce highly complex forms would resolve it. Even so, as Chaitin said, it seems likely that such a proof could exist – but if there’s a rigorous proof of the contrary, that would be good to know also!

Of course, such a formal proof doesn’t exist because formal proof doesn’t play much role in biology, or any other empirical science – since living things are very complex, and incompletely understood. Thus the proposal of a different field, “metabiology”, which would study simpler formal objects: “artificial software” in the form of Turing machines or program code, as opposed to “natural software” like DNA. This abstracts away everything about an organism except its genes (which is a lot!), with the aim of simplifying enough to prove that mutation and selection in this toy world can generate arbitrarily high levels of complexity.

Actually stating this precisely enough to prove ties in to the work that Chaitin is better known for, namely the study of algorithmic complexity and theoretical computer science. The two theorems Chaitin stated (but didn’t prove in the talk) did not – he admitted – really meet that goal, but perhaps did point in that direction. One measure of complexity is algorithmic – that is, the size of a Turing machine (for example, though a similar definition applies to other universal ways of describing algorithms) which is needed to generate a particular pattern. A standard example is the “Busy Beaver function”, and one way to define it is to say that B(n) is the largest number printed out by an n-state Turing machine which then halts. Since the halting problem is uncomputable (i.e. there’s no Turing machine which, given a description of another machine, can always decide whether or not it halts), for reasons analogous to Cantor’s diagonal argument or Gödel’s incompleteness theorem, generating B(n), or a sequence of the same order, is a good task to measure complexity.

So the first toy model involved a single organism, being replaced in each generation by a mutant form. The “organism” is a Turing machine (or a program in some language, etc. – one key result from complexity theory is that all these different ways to specify an algorithm can simulate each other, with the addition of at most a fixed-size prefix, which is the part of the algorithm describing how to do the simulation). In each generation, it is mutated. The mutant replaces the original organism if: (a) the new code halts, and (b) outputs a number which (c) is larger than the number produced by the original. Now, this decision procedure is uncomputable since it requires solving the halting problem – so in particular, there’s no way to simulate this process. But the theorem says that, in exponential time (i.e. t(n) \sim O(e^n)), this process will produce a machine which produces a number of order B(n). That is, as long as the “environment” (the thing doing the selection) can recognize and reward complexity, mutation is sufficient to produce it. But these are pretty big assumptions, which is one reason this theorem isn’t quite what’s wanted.
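The real process can't be simulated, since the selection step needs a halting oracle. Purely to illustrate the shape of the mutate-and-select loop, here is a sketch under a big simplifying assumption: every "program" is a straight-line list of arithmetic operations, so it always halts and the oracle is never needed. The operation set, MAX_LEN, and all function names are hypothetical choices, and nothing here reproduces Chaitin's theorems.

```python
import random

# Assumed toy instruction set; every program built from these always halts.
OPS = {
    "inc": lambda x: x + 1,
    "dbl": lambda x: 2 * x,
    "sqr": lambda x: x * x,
}
MAX_LEN = 8  # cap on program length, so outputs stay bounded


def run(program):
    """Apply the operations in order, starting from 1; the result is the fitness."""
    x = 1
    for op in program:
        x = OPS[op](x)
    return x


def mutate(program, rng):
    """Return a copy with one random insertion, deletion, or replacement."""
    child = list(program)
    choice = rng.choice(["insert", "delete", "replace"])
    if choice == "insert" and len(child) < MAX_LEN:
        child.insert(rng.randrange(len(child) + 1), rng.choice(list(OPS)))
    elif choice == "delete" and child:
        del child[rng.randrange(len(child))]
    elif child:
        child[rng.randrange(len(child))] = rng.choice(list(OPS))
    return child


def evolve(generations, seed=0):
    """Single-organism hill climb: a mutant replaces its parent only if it scores higher."""
    rng = random.Random(seed)
    organism = []
    fitness = run(organism)
    history = [fitness]
    for _ in range(generations):
        mutant = mutate(organism, rng)
        mutant_fitness = run(mutant)
        if mutant_fitness > fitness:
            organism, fitness = mutant, mutant_fitness
        history.append(fitness)
    return organism, history


organism, history = evolve(300)
```

The fitness history is non-decreasing by construction, which is the only feature this toy shares with the model in the talk; the interesting content of the theorem is about how fast it grows when the fitness function is as hard as B(n).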

Still, within its limited domain, he also stated a theorem to the effect that, for any given level of complexity (in the above sense), there is a path through the space of possible programs which reaches it, such that the “mutation distance” (roughly, the negative logarithm of the probability of a mutation occurring) at each step is bounded, and the complexity (therefore fitness, in this toy model) increases at each step. He indicated that one could prove this using the bits of the halting probability Omega – he didn’t specify how, and this isn’t something I’m very familiar with, but apparently (as described in the linked article), there are somewhat standard ways to do this kind of thing.

So anyway, this little toy model doesn’t really do the job Chaitin is saying ought to be done, but it illustrates what the kind of theorems he’s asking for might look like. My reaction is that it would be great to have theorems like this that could tell us something meaningful about real biology (so the toy model certainly is too simple), though I’m not totally convinced there needs to be a “new field” for such study. But certainly theoretical biology seems to be much less developed than, say, theoretical physics, and even if rigorous proofs aren’t going to be as prominent there, if some can be found, it probably couldn’t hurt.

After the talk, there was some interesting discussion about other things going on in theoretical biology and “systems biology“.  Chaitin commented that a lot of the work in this field involves detailed simulations of models of real systems, made as accurate as possible – which, while important, is different from the kind of pursuit of basic theoretical principles he was talking about.  So this would include things like: modeling protein folding; studying patterns in big databases of gene frequencies in populations and how they change in time; biophysical modeling of organs and the biochemical reactions in them; simulating the dynamics of individual cells, their membranes and the molecular machinery that makes them work; and so on.  All of which has been moving rapidly in recent years,  but is only tangentially related to fundamental principles about how life works.

On the other hand, as audience members pointed out, there is another thread, exemplified by the Santa Fe Institute, which is more focused on understanding the dynamics of complex systems.  Some well-known names in this area would be Stuart Kauffman, John Holland and Per Bak, among others.  I’ve only looked into this stuff at the popular level, but there are some interesting books about their work – Holland’s “Hidden Order”, Kauffman’s “The Origins of Order” (more technical) and “At Home in the Universe” (more popular), and Solé and Goodwin’s “Signs of Life” (a popular survey, but with equations, of various aspects of mathematical approaches to biological complexity).  Chaitin’s main comment on this stuff is that it has produced plenty of convincing heuristic arguments, simulations and models with suggestive behaviour, and so on – but not many rigorous theorems.  So: it’s good, but not exactly what he meant by “metabiology”.

Summarizing this stuff would be a big task in itself, but it does connect to Chaitin’s point that it might be nice to know (rigorously) if Darwinian evolution by itself were NOT enough to explain the complexity of living things.  Stuart Kauffman, for example, has suggested that certain kinds of complex order tend to arise through “self-organization”.  Philosopher Daniel Dennett commented on this in “Darwin’s Dangerous Idea”, saying that although this might be true, at most it tells us more detail about what kinds of things Darwinian selection has available to act on.

This all seems to tie into the question of which appeared first as life was coming into being: self-replicating molecules like RNA (and later DNA), or cells with metabolic reactions occurring inside.  Organisms obviously both reproduce and metabolize, but these are two quite different kinds of process, and there seems to be a “chicken-and-egg” problem with which came first.  Kauffman, among others, has looked at the emergence of “autocatalytic networks” of chemical reactions: these are collections of chemical reactions, some or all of which need a catalyst, such that all the catalysts needed to make them run are products of some reaction in the network.  They’ve shown in simulation that such networks can arise spontaneously under certain conditions – suggesting that metabolism might have come into existence without DNA or similar molecules around (one also thinks of larger phenomena, like the nitrogen cycle).  In any case, this is the kind of thing which people sometimes point to when suggesting that Darwinian selection isn’t enough to completely explain the structure of organisms actually existing today.  Which is a different claim (mind you) than the claim that Darwinian evolution could not possibly produce complex organisms.  Chaitin’s whole motivation was to suggest that it should be provable one way or the other (and, he presumes, in the affirmative) whether mutation and selection CAN do this job.  If it could be proved that it can’t – at least there are some other ingredients to consider.

All in all, I found the talk thought-provoking, in spite (or because) of being partial and inconclusive.  Biology may be less rigorous than physics, but this could just be a sign that there’s a lot to learn and do in the field – and a lot of it is being done!

I just posted the slides for “Groupoidification and 2-Linearization”, the colloquium talk I gave at Dalhousie when I was up in Halifax last week. I also gave a seminar talk in which I described the quantum harmonic oscillator and extended TQFT as examples of these processes, which covered similar stuff to the examples in a talk I gave at Ottawa, as well as some more categorical details.

Now, in the previous post, I was talking about different notions of the “state” of a system – all of which are in some sense “dual to observables”, although exactly what sense depends on which notion you’re looking at. Each concept has its own particular “type” of thing which represents a state: an element-of-a-set, a function-on-a-set, a vector-in-(projective)-Hilbert-space, and a functional-on-operators. In light of the above slides, I wanted to continue with this little bestiary of ontologies for “states” and mention the versions suggested by groupoidification.

State as Generalized Stuff Type

This is what groupoidification introduces: the idea of a state in Span(Gpd). As I said in the previous post, the key concepts behind this program are state, symmetry, and history. “State” is in some sense a logical primitive here – given a bunch of “pure” states for a system (in the harmonic oscillator, you use the nonnegative integers, representing n-photon energy states of the oscillator), and their local symmetries (the n-particle state is acted on by the permutation group on n elements), one defines a groupoid.

So at a first approximation, this is like the “element of a set” picture of state, except that I’m now taking a groupoid instead of a set. In a more general language, we might prefer to say we’re talking about a stack, which we can think of as a groupoid up to some kind of equivalence, specifically Morita equivalence. But in any case, the image is still that a state is an object in the groupoid, or point in the stack which is just generalizing an element of a set or point in configuration space.

However, what is an “element” of a set S? It’s a map into S from the terminal object in \mathbf{Sets}, which is “the” one-element set – or, likewise, in \mathbf{Gpd}, from the terminal groupoid, which has only one object and its identity morphism. However, this is a category where the arrows are set maps. When we introduce the idea of a “history”, we’re moving into a category where the arrows are spans, A \stackrel{s}{\leftarrow} X \stackrel{t}{\rightarrow} B (which by abuse of notation sometimes gets called X but is more formally (X,s,t)). A span represents a set/groupoid/stack of histories, with source and target maps into the sets/groupoids/stacks of states of the system at the beginning and end of the process represented by X.
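For plain finite sets, spans and their composition are easy to write down. In the sketch below (the representation and names are my own choices for illustration), a span A \leftarrow X \rightarrow B is stored as a list with one (source, target) pair per history in X, and two spans are composed by the usual pullback: a composite history is a pair of histories that agree on the middle state. Counting histories between each pair of states gives a matrix, and composition of spans becomes matrix multiplication.

```python
from collections import Counter


def compose(span1, span2):
    """Compose spans A <- X -> B and B <- Y -> C by pullback over B.

    Each span is a list of (source, target) pairs, one per history.
    The composite has one history for each pair (x, y) with t(x) == s(y).
    """
    return [(a, c) for (a, b) in span1 for (b2, c) in span2 if b == b2]


def count_matrix(span):
    """Number of histories between each (source, target) pair of states."""
    return Counter(span)


# Two histories from a1 to b1, one from a2 to b2.
X = [("a1", "b1"), ("a1", "b1"), ("a2", "b2")]
# One history from b1 to c1; from b2, one to c1 and one to c2.
Y = [("b1", "c1"), ("b2", "c1"), ("b2", "c2")]

XY = count_matrix(compose(X, Y))

# A "state" in this picture is a span out of the one-element set.
state = [("*", "a1"), ("*", "a1"), ("*", "a2")]
pushed = count_matrix(compose(state, X))
```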

Then we don’t have a terminal object anymore, but the same object 1 is still around – only the morphisms in and out are different. Its new special property is that it’s a monoidal unit. So now a map from the monoidal unit is a span 1 \stackrel{!}{\leftarrow} X \stackrel{\Phi}{\rightarrow} B. Since the map on the left is unique, by definition of “terminal”, this is really just given by the functor \Phi, the target map. This is a fibration over B, called here \Phi for “phi”-bration, but this is appropriate, since it corresponds to what’s usually thought of as a wavefunction \phi.

This correspondence is what groupoidification is all about – it has to do with taking the groupoid cardinality of fibres, where a “phi”bre of \Phi is the essential preimage of an object b \in B – everything whose image is isomorphic to b. This gives an equivariant function on B – really a function of isomorphism classes. (If we were being crude about the symmetries, it would be a function on the quotient space – which is often what you see in real mechanics, when configuration spaces are given by quotients by the action of some symmetry group).
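Groupoid cardinality is the sum, over isomorphism classes of objects, of 1/|Aut(x)|. For the quotient groupoid of a finite group G acting on a finite set X, the isomorphism classes are the orbits and the automorphism groups are the stabilizers, so the cardinality comes out to |X|/|G|. Here is a small sketch checking this for an example I made up (S_3 permuting the positions of binary strings of length 3); the function name is illustrative.

```python
from fractions import Fraction
from itertools import permutations


def action_groupoid_cardinality(X, G, act):
    """Sum over orbits of 1/|Stab(x)|, for a finite group G acting on a finite set X."""
    seen = set()
    total = Fraction(0)
    for x in X:
        if x in seen:
            continue  # already counted this isomorphism class (orbit)
        orbit = {act(g, x) for g in G}
        stabilizer_size = sum(1 for g in G if act(g, x) == x)
        total += Fraction(1, stabilizer_size)
        seen |= orbit
    return total


# S_3 as permutations of (0, 1, 2), acting on binary strings by permuting positions.
G = list(permutations(range(3)))
X = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]


def act(g, x):
    return tuple(x[g[i]] for i in range(3))


card = action_groupoid_cardinality(X, G, act)
```

The "crude" quotient would just count the four orbits; the groupoid cardinality 8/6 remembers the symmetries as well.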

In the case where B is the groupoid of finite sets and bijections (sometimes called \mathbf{FinSet_0}), these fibrations are the “stuff types” of Baez and Dolan. This is a groupoid with something of a notion of “underlying set” – although a forgetful functor U: C \rightarrow \mathbf{FinSet_0} (giving “underlying sets” for objects in a category C) is really supposed to be faithful (so that C-morphisms are determined by their underlying set map). In a fibration, we don’t necessarily have this. The special case where it is faithful corresponds to “structure types” (or combinatorial species), where X is a groupoid of “structured sets”, with an underlying set functor (actually, species are usually described in terms of the reverse, fibre-selecting functor \mathbf{FinSet_0} \rightarrow \mathbf{Sets}, where the image of a finite set consists of the set of all “\Phi-structured” sets (such as: “graphs on set S”, or “trees on S”, etc.)). The fibres of a stuff type are sets equipped with “stuff”, which may have its own nontrivial morphisms (for example, we could have the groupoid of pairs of sets, and the “underlying” functor \Phi selects the first one).

Over a general groupoid, we have a similar picture, but instead of having an underlying finite set, we just have an “underlying B-object”. These generalized stuff types are “states” for a system with a configuration groupoid, in Span(\mathbf{Gpd}). Notice that the notion of “state” here really depends on what the arrows in the category of states are – histories (i.e. spans), or just plain maps.

Intuitively, such a state is some kind of “ensemble”, in statistical or quantum jargon. It says the state of affairs is some jumble of many configurations (which we apparently should see as histories starting from the vacuous unit 1), each of which has some “underlying” pure state (such as energy level, or what-have-you). The cardinality operation turns this into a linear combination of pure states by defining weights for each configuration in the ensemble collected in X.
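As a sketch of how the weights come out, suppose X is described by a list of its isomorphism classes, each recorded as a pair (the class in B it maps to, the size of its automorphism group in X). This encoding is my own simplification for illustration. The weight on a pure state b is then the groupoid cardinality of the fibre over b. Two standard stuff types over finite sets serve as examples: "being a finite set" gives weights 1/n!, and "being a totally ordered finite set" gives weights 1.

```python
from fractions import Fraction
from math import factorial


def state_weights(X_classes):
    """Weights on pure states from a list of (image class in B, |Aut| in X) pairs.

    The weight at b is the sum of 1/|Aut(x)| over classes of X lying over b.
    """
    weights = {}
    for b, aut_size in X_classes:
        weights[b] = weights.get(b, Fraction(0)) + Fraction(1, aut_size)
    return weights


N = 6  # truncate at sets of size < N

# "Being a finite set": one class per n, with automorphism group S_n.
finite_sets = [(n, factorial(n)) for n in range(N)]

# "Being a totally ordered finite set": one class per n, no nontrivial automorphisms.
total_orders = [(n, 1) for n in range(N)]

w_sets = state_weights(finite_sets)
w_orders = state_weights(total_orders)
```

Read as power-series coefficients, the first is the truncation of e^z and the second of 1/(1-z).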

2-State as Representation

A linear combination of pure states is, as I said, an equivariant function on the objects of B. It’s one way to “categorify” the view of a state as a vector in a Hilbert space, or map from \mathbb{C} (i.e. a point in the projective Hilbert space of lines in the Hilbert space H = \mathbb{C}[\underline{B}]), which is really what’s defined by one of these ensembles.

The idea of 2-linearization is to categorify, not a specific state \phi \in H, but the concept of state. So it should be a 2-vector in a 2-Hilbert space associated to B. The Hilbert space H was some space of functions into \mathbb{C}, which we categorify by taking instead of a base field, a base category, namely \mathbf{Vect}_{\mathbb{C}}. A 2-Hilbert space will be a category of functors into \mathbf{Vect}_{\mathbb{C}} – that is, the representation category of the groupoid B.

(This is all fine for finite groupoids. In the infinite case, there are some issues: it seems we really should be thinking of the 2-Hilbert space as a category of representations of an algebra. In the finite case, the groupoid algebra is a finite dimensional C*-algebra – that is, just a direct sum (over iso. classes of objects) of matrix algebras, which are the group algebras for the automorphism groups at each object. In the infinite dimensional world, you probably should be looking at the representations of the von Neumann algebra completion of the C*-algebra you get from the groupoid. There are all sorts of analysis issues about measurability that lurk in this area, but they don’t really affect how you interpret “state” in this picture, so I’ll skip it.)

A “2-state”, or 2-vector in this 2-Hilbert space, is a representation of the groupoid(-algebra) associated to the system. The “pure” states are irreducible representations – these generate all the others under the operations of the 2-Hilbert space (“sum”, “scalar product”, etc. in their 2-vector space forms). Now, an irreducible representation of a von Neumann algebra is called a “superselection sector” for a quantum system. It’s playing the role of a pure state here.

There’s an interesting connection here to the concept of state as a functional on a von Neumann algebra. As I described in the last post, the GNS representation associates a representation of the algebra to a state. In fact, the GNS representation is irreducible just when the state is a pure state. But this notion of a superselection sector makes it seem that the concept of 2-state has a place in its own right, not just by this correspondence.

So: if a quantum system is represented by an algebra \mathcal{A} of operators on a Hilbert space H, that representation is a direct sum (or direct integral, as the case may be) of irreducible ones, which are “sectors” of the theory, in that no operator in \mathcal{A} can take a vector out of one of these “sectors”. Physicists often associate them with conserved quantities – though “superselection” sectors are a bit more thorough: a mere “selection sector” is a subspace where the projection onto it commutes with some subalgebra of observables which represent conserved quantities. A superselection sector can equivalently be defined as a subspace whose corresponding projection operator commutes with EVERYTHING in \mathcal{A}. In this case, it’s because we shouldn’t have thought of the representation as a single Hilbert space, but as a direct integral of some Hilbert bundle that lives on the space of irreps: it’s a 2-vector in \mathbb{Rep}(\mathcal{A}). Those projections are just part of the definition of such a bundle. The fact that \mathcal{A} acts on this bundle fibre-wise is just a consequence of the fact that the total H is a space of sections of the “2-state”. These correspond to “states” in the usual sense in the physical interpretation.
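The finite-dimensional shadow of this is just block-diagonal matrices. In the sketch below (sizes and names are arbitrary assumptions), the algebra consists of operators that are block-diagonal with respect to a splitting of a 5-dimensional space into sectors of dimension 2 and 3, and the projection onto the first sector commutes with every element.

```python
import numpy as np


def block_diag(*blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n), dtype=complex)
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out


rng = np.random.default_rng(1)
sizes = (2, 3)  # assumed dimensions of the two sectors


def random_element():
    """A random operator in the algebra: arbitrary within each sector, zero between them."""
    blocks = [rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k)) for k in sizes]
    return block_diag(*blocks)


# Projection onto the first sector.
P = block_diag(np.eye(2), np.zeros((3, 3)))

elements = [random_element() for _ in range(5)]
commutator_norms = [np.linalg.norm(P @ A - A @ P) for A in elements]
```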

Now, there are 2-linear maps that intermix these superselection sectors: the ETQFT picture gives nice examples. Such a map, for example, comes up when you think of two particles colliding (drawn in that world as the collision of two circles to form one circle). The superselection sectors for the particles are labelled by (in one special case) mass and spin – anyway, some conserved quantities. But these are, so to say, “rest mass” – so there are many possible outcomes of a collision, depending on the relative motion of the particles. So these 2-maps describe changes in the system (such as two particles becoming one) – but in a particular 2-Hilbert space, say \mathbb{Rep}(X) for some groupoid X describing the current system (or its algebra), a 2-state \Phi is a representation of the resulting system. A 2-state-vector is a particular representation. The algebra \mathcal{A} can naturally be seen as a subalgebra of the automorphisms of \Phi.

So anyway, without trying to package up the whole picture – here are two categorified takes on the notion of state, from two different points of view.

I haven’t, here, got to the business about Tomita flows coming from states in the von Neumann algebra sense: maybe that’s to come.

In my post about my short talk at CQC, I mentioned that the groupoidification program in physics is based on a few simple concepts (most research programs are, I suppose). The ones I singled out are: state, symmetry, and history. But since concepts tend to seem simpler if you leave them undefined, there are bound to be subtleties here. Recently I’ve been thinking about the first one, state. What is a state? What is this supposedly simple concept?

Etymology isn’t an especially reliable indicator of what a word means, or even the history of a concept (words change meanings, and concepts shift over time), but it’s sometimes interesting to trace. The English word “state” comes from the Latin verb stare, meaning “to stand”, whose past participle is status, which is also borrowed directly into English. The Proto-Indoeuropean root sta- also means “stand”, and the English word “stand” in turn comes from this root, but this time via Germanic (along with “standard”). However, most of the words with this root come via various Latin intermediaries: state, stable, status, statue, stationary, station, and also substance, understand and others. The state of affairs is sometimes referred to as being “how things stand”, how they are, the current condition. Most of the words based on the sta- root imply non-motion (i.e. “stasis”). If anything, “state” (like “status”) carries this connotation less strongly than most, since the state of affairs can change – but it emphasizes how things stand now and not how they’re changing. From this sense, we also get the political meaning of “a state”, a reified version of a term originally meaning the political condition of a country (by analogy with Latin expressions like status rei publicae, the “condition of public affairs”).

So, narrowing focus now, the “state” of a physical system is the condition it’s in. In different models of physics, this is described in different ways, but in each case, by the “condition” we mean something like a complete description of all the facts about the system we can get. But this means different things in different settings. So I just want to take a look at some of them.

Think of these different settings for physics as being literally “settings” (but please excuse the pun) of the switches on a machine. Three of the switches are labelled Thermal, Quantum, and Relativistic. The “Thermal” switch varies whether or not we’re talking about thermodynamics or ordinary mechanics. The “Quantum” switch varies whether we’re talking about a quantum or classical system.

The “Relativistic” switch, which I’ll ignore for this post, specifies what kind of invariance we have: Galilean for Newton’s physics; Lorentzian for Special Relativity; general covariance for General Relativity. But this gets into dynamics, and “state” implies things are, well, static – that is, it’s about kinematics. At the very least, in Relativity, it’s not canonical what you mean by “now”, and so the definition of a state must include choosing a reference frame (in SR), or a Cauchy hypersurface (in GR). So let’s gloss over that for now.

When all these switches are in the “off” position, we have classical mechanics. Here, we think of a state as, at a first level of approximation, an element of a set. Now, for serious classical mechanics, this set will be a symplectic manifold, like the cotangent bundle T^*M of some manifold M. This is actually a bit subtle already, since a point in T^*M represents a collection of positions and momenta (or some generalization thereof): that is, we can start with a space of “static” configurations, parametrized by the values of some observable quantities, but a state (contrary to what etymology suggests) also includes momenta describing how those quantities are changing with time (which, in classical mechanics, is a fairly unproblematic notion).

The Hamiltonian picture of the dynamics of the system then tells us: given its state, what will be the acceleration, which we can then use to calculate states at future times. This requires a Hamiltonian, H, which we think of as the energy, which can be calculated from the state. So, for example, kinetic plus potential energy: in the case of a particle moving in a potential on a line, H = K + V = p^2/(2m) + V(q). The space of states can be described without much reference to the Hamiltonian, but once we have H, we get a flow on that space, transforming old states into new states with time.
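To make the idea of a flow on state space concrete, here is a minimal numerical sketch. It assumes the standard kinetic term p^2/(2m) and, purely for illustration, a harmonic potential V(q) = k q^2/2; the function names, the parameter k and the leapfrog scheme are choices made for this example, not anything specific to the discussion above.

```python
import math


def hamiltonian(q, p, m=1.0, k=1.0):
    """Energy H = p^2/(2m) + V(q), with the illustrative choice V(q) = k q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def leapfrog_step(q, p, dt, m=1.0, k=1.0):
    """Advance the state (q, p) by a time dt using Hamilton's equations
    dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -k q."""
    p_half = p - 0.5 * dt * k * q           # half step in momentum
    q_new = q + dt * p_half / m             # full step in position
    p_new = p_half - 0.5 * dt * k * q_new   # second half step in momentum
    return q_new, p_new


def flow(q, p, t, steps=1000, m=1.0, k=1.0):
    """Approximate the Hamiltonian flow: send the state (q, p) to its image after time t."""
    dt = t / steps
    for _ in range(steps):
        q, p = leapfrog_step(q, p, dt, m, k)
    return q, p


# With m = k = 1 the exact motion has period 2*pi, so the state should come back
# close to where it started, with nearly the same energy.
q_end, p_end = flow(1.0, 0.0, 2 * math.pi)
print(q_end, p_end, hamiltonian(q_end, p_end))
```

The state here is just the pair (q, p), a point of T^*M for M the real line, and `flow` is the map from old states to new states that H determines.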

Now if we turn on the “Thermal” switch, we have a different notion of state. The standard image for the classical mechanical system is that we may be talking about a particle, or a few particles, or perhaps a rigid object, moving in space, maybe subject to some constraints. In thermodynamics, we are thinking of a statistical ensemble of objects – in the simplest case, N identical objects – and want to ask how energy is distributed among them. The standard image is of a box full of gas at some temperature: it’s full of molecules, each with its own trajectory, and they interact through collisions and exchange energy and momentum. Rather than tracking the exact positions of molecules, in thermodynamics a “state” is a distribution, or more precisely a probability measure, on the space of classical states described above. We don’t assume we know the detailed microstate of the system – the positions and momenta of all the particles in the gas – but only something about how positions and momenta are distributed among the particles. This reflects the real fact that we can only measure things like pressure, temperature, etc. The measure is telling us the proportion of particles with positions and momenta in a given range.

This is a big difference for something described by the same word “state”. Even assuming our underlying space of “microstates” is still the same T^*M, the state is no longer a point. One way to interpret the difference is that here the state is something epistemic. It describes what we know about the system, rather than everything about it. The measure answers the question: “given what we know, what is the likelihood the system is in microstate X?” for each X. Now, of course, we could take a space of all such measures: given our previous classical system, it’s a space of functionals on C(T^*M). Then the state can again be seen as an element of a set. But it’s more natural to keep in view its nature as a measure, or, if it’s nice enough, as a positive function on the space of states. (It’s interesting that this is an object of the same type as the Hamiltonian – this is, intuitively, the basis of what Carlo Rovelli calls the “Thermal Time Hypothesis”, summarized here, which is secretly why I wanted to write on this topic. But more on that in a later post. For one thing, before I can talk about it, I have to talk about what comes next.)

Now turn off the “Thermal” switch, and think about the “Quantum” switch. Here there are a couple of points of view.

To begin with, we describe a system in terms of a Hilbert space, and a state is a vector in a Hilbert space. Again, this could be described as an element of a set, but the complex linear structure is important, so we keep thinking of it as fundamental to the type of a state. In geometric quantization, one often starts with a classical system with a state space like T^*M = X, and then takes the Hilbert space \mathcal{H}=L^2(X), so that a state is (modulo analysis issues) basically a complex-valued function on X. This is something like the (positive real-valued) measure which gives a thermodynamic state, but the interpretation is trickier. Of course, if \mathcal{H} is an L^2-space, we can recover a probability measure, since the square modulus of \phi \in \mathcal{H} has finite total measure (so we can normalize it). But this isn’t enough to describe \phi, and the extra information of phases goes missing. In any case, the probability measure no longer has the obvious interpretation of describing the statistics of a whole ensemble of identical systems – only the likelihood of measuring particular values for one system in the state \phi. (In fact, there are various no-go theorems getting in the way of a probability interpretation of \phi, though this again involves dynamics – a recurring theme is that it’s hard to reason sensibly about states without dynamics). So despite some similarity, this concept of “state” is very different, and phase is a key part of how it’s different. I’ll be jiggered if I can say why, though: most of the “huh?” factor in quantum mechanics lives right about here.

Another way to describe the state of a quantum system is related to this probability, though. The inner product of \mathcal{H} (whether we found it as an L^2-space or not) gives a way to talk about statistics of the system under repeated observations. Observables, which for the classical picture are described by functions on the state space X, are now self-adjoint operators on \mathcal{H}. The expectation value for an observable A in the state \phi is \langle \phi | A | \phi \rangle (note that the Dirac notation implicitly uses self-adjointness of A). So the state has another, intuitively easier, interpretation: it’s a real-valued functional on observables, namely the one I just described.
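As a small worked instance of a state acting as a functional on observables, here is a sketch for a two-dimensional Hilbert space. The choice of the Pauli matrices as observables and of the two particular vectors is an assumption made for illustration; only the formula \langle \phi | A | \phi \rangle comes from the discussion above.

```python
import math


def expectation(phi, A):
    """Return <phi|A|phi> for a vector phi (list of complex numbers) and a square matrix A."""
    n = len(phi)
    # Apply the operator: (A phi)_i = sum_j A_ij phi_j
    A_phi = [sum(A[i][j] * phi[j] for j in range(n)) for i in range(n)]
    # Pair with phi, conjugating the left-hand vector
    return sum(phi[i].conjugate() * A_phi[i] for i in range(n))


# Two self-adjoint observables (Pauli matrices), chosen for illustration
sigma_z = [[1, 0], [0, -1]]
sigma_y = [[0, -1j], [1j, 0]]

s = 1 / math.sqrt(2)
phi = [complex(s, 0), complex(0, s)]   # components (1, i) / sqrt(2)
psi = [complex(s, 0), complex(s, 0)]   # components (1, 1) / sqrt(2)

for name, state in (("phi", phi), ("psi", psi)):
    print(name, expectation(state, sigma_z), expectation(state, sigma_y))
```

The vectors `phi` and `psi` have components of equal square modulus, so they give the same probabilities in this basis, yet their expectation values for `sigma_y` differ. That difference is carried entirely by the relative phase.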

The observables live in the algebra \mathcal{A} = \mathcal{B}(\mathcal{H}) of bounded operators on \mathcal{H}. Setting both the Thermal and Quantum switches of our notion of “state” to “on” gives quantum statistical mechanics. Here, the “C*-algebra” (or von Neumann-algebra) picture of quantum mechanics says that really it’s the algebra \mathcal{A} that’s fundamental – it corresponds to actual operations we can perform on the system. Some of them (the self-adjoint ones) represent really very intuitive things, namely observables, which are tangible, measurable quantities. In this picture, \mathcal{H} isn’t assumed to start with at all – but when it is, the kind of object we’re dealing with is a density matrix. This is (roughly) a positive operator on \mathcal{H} of unit trace. In general a state on a von Neumann algebra is a positive linear functional, normalized to take the value 1 on the identity.
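For reference, the standard way a density matrix gives a functional on \mathcal{A} can be written out as follows. The symbols \rho, p_i and \omega_\rho are notation introduced here for illustration, and the sum is assumed to be a finite (or convergent) mixture of unit vectors \phi_i.

```latex
\rho = \sum_i p_i \, |\phi_i\rangle\langle\phi_i| ,
\qquad p_i \ge 0, \quad \sum_i p_i = 1, \quad \langle \phi_i | \phi_i \rangle = 1

\omega_\rho(A) = \mathrm{Tr}(\rho A) = \sum_i p_i \, \langle \phi_i | A | \phi_i \rangle

\omega_\rho(1) = \mathrm{Tr}(\rho) = 1 ,
\qquad \omega_\rho(A^* A) \ge 0
```

When only one p_i is nonzero this reduces to the expectation value \langle \phi | A | \phi \rangle of a single vector state, and the last line is what “positive” and “normalized” mean for the functional.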

This is analogous to the view of a state as a probability measure (positive function with unit total integral) in the classical realm: if an observable is a function on states (giving the value of that observable in each state), then a measure is indeed a functional on the space of observables. A probability measure, in fact, is the functional giving the expectation value of the observable. (And, since variance and all the higher moments of the probability distribution for that observable are themselves defined as expectation values, it also tells us all of those.)

On the other hand, the Gelfand-Naimark-Segal theorem says that, given a state \phi : \mathcal{A} \rightarrow \mathbb{C}, there’s a representation of \mathcal{A} as an algebra of operators on some Hilbert space, and a vector v for which this \phi is just \phi(A) = \langle v | A | v \rangle. This is the GNS representation (and in fact it’s built by taking the regular representation of \mathcal{A} on itself by multiplication, with \mathcal{A} made into a Hilbert space by defining the inner product to make this property work, and with v = 1). So the view here is that a state is some kind of operation on observables – a much more epistemic view of things. So although the GNS theorem relates this to the vector-in-Hilbert-space view of “state”, they are quite different conceptually. (For one thing, the GNS representation is giving a different Hilbert space for each state, which undermines the sense that the space of ALL states is fundamentally “there”, but in both pictures \mathcal{A} is the same for all states.)

(This von Neumann-algebra point of view, by the way, gets along nicely with the 2-Hilbert space lens for looking at quantum mechanics, which may partly bridge the gap between it and the Hilbert-space view. The category of representations of a von Neumann algebra is a 2-Hilbert space. A “2-vector” (or “2-state”, if you like) in this category is a representation of the algebra. So the GNS representation itself is a “2-state”. This raises the question about 2-algebras of 2-operators, and John Baez’ question: “What is the categorified GNS theorem?” But let’s leave 2-states for later along with the rest.)

So where does this leave us regarding the meaning of “state”? The classical view is that a state is an element of some (structured) set. The usual quantum picture is that a state is, depending on how precise you want to be, either a vector in a Hilbert space, or a 1-d subspace of that Hilbert space – that is, a point in the projective Hilbert space. What these two views have in common is that there is some space of all “possible worlds”, i.e. of all ways things can be in the system being studied. A state is then a way of selecting one of these. The difference is in what this space of possible worlds is like – that is, which category it lives in – and how exactly one “selects” a state. How they differ is in the possibility of taking combinations of states. As for selecting states, Sets is a Cartesian category, with a terminal object 1 = {*}: an element of a set is a map from 1 into it. Hilb is a monoidal category, but not Cartesian: selecting a single vector has no obvious categorical equivalent, though selecting a 1-D subspace amounts to a map from \mathbb{C} (up to isomorphism). So the model of an “element” isn’t a singleton, it’s the complex line – and it relates to other possible spaces differently: not as a terminal object, but as a monoidal unit. This is a categorical way of saying how the idea of “state” is structurally different.

The thermal point of view is a little more epistemically subtle: for both classical and quantum pictures, it’s best thought of as, not a possible world, but a function acting on observables (that is, conditions of knowledge). In the classical picture, this is directly related to a space of possible worlds – it’s a measure on it, which we can think of as saying how a large ensemble of systems are distributed in that space. In the quantum picture, in some ways the most (epistemically) natural view, in terms of von Neumann algebras, breaks the connection to this notion of “possible worlds” altogether, since \mathcal{A} has representations on many different Hilbert spaces.

So a philosophical question is: what do these different concepts have in common that lets us use them all to represent the “same” root idea? Without actually answering this, I’ll just mention that at some point I’d like to talk a bit about “2-states” as 2-vectors, and in general how to categorify everything above.

So this paper of mine was recently accepted by the Journal of Homotopy and Related Structures (the version that was accepted should be reflected on the arXiv by tomorrow – i.e. July 10 – I’m not sure about the journal). It’s been a while since I sent out the earliest version, and most of the changes have involved figuring out who the audience is, and consequently what could be left out. I guess that’s a side-effect of taking an excerpt from my thesis, which was much longer. In any case, it now seems to have reached a final point. Some of what was in it – the section about cobordisms – is now in a paper (in progress) about TQFT. I don’t see anywhere else to include the other missing bit, however, which has to do with Lawvere theories, and since I just wrote a bunch about MakkaiFest, I thought I might include some of that here.

The paper came about because I was trying to write my thesis, which describes an extended TQFT as a 2-functor (and considers how it could produce a version of 3D quantum gravity). The 2-functor

Z_G : nCob_2 \rightarrow 2Vect

(or into 2Hilb) is an ETQFT. The construction of the 2-functor uses the fact that you can get spans of groupoids out of cospans of manifolds – and in particular, out of cobordisms. One problem is how to describe nCob_2 so that this works. It’s actually most naturally a cubical 2-category of some kind. The strict version of this concept is a double category – which has (in principle separate) categories of horizontal and vertical morphisms, as well as square 2-cells. Ideally, one would like a “weak” version, where composition of squares and morphisms can be only weakly associative (and have weak unit laws). A “pseudocategory” implements this where the only higher-dimensional morphisms are the squares, but it turns out to be strict in one direction, and weak in the other. As it happens, it’s a big pain to use only squares for the 2-morphisms.

Initially it seemed I would have to define a whole new structure to get weak composition in both directions, because in both directions, composition represents gluing bits of manifolds together along boundaries – using a diffeomorphism (or a smooth homeomorphism, depending on which kind of manifolds we’re dealing with). I called it a “double bicategory” and started trying to define it along the same lines as a double category. It then turned out that Dominic Verity had already defined a “double bicategory” – you can read the paper where I talk about how the notions are related. Here I want to talk about a few aspects which I cut out of the paper along the way.

The idea is that there are two ways of “categorifying”: internalization, and enrichment. A bicategory is a category enriched in Cat, the category of categories – for any two elements, there’s a whole hom-category of morphisms (and 2-morphisms). A double category is a category internal to Cat. This means you can think of it as a category of objects and a category of morphisms, equipped with functors satisfying all the usual properties for the maps in the definition of a category: composition functors, unit functors, and so forth. This definition turns out to be equivalent to the usual one. So I thought: why not do the same with bicategories?

Thus, the way I defined double bicategory was: “A bicategory internal to Bicat”. In the paper as it stands, that’s all I say. What I cut out was a sort of dangling loose end pointing toward Lawvere theories – or rather, a variant thereof – finite limit theories (for something more detailed, see this recent paper by Lack and Rosicky). As I mentioned in the previous post, a Lawvere theory is an approach to universal algebra – it formally defines a kind of object (e.g. group, ring, abelian group, etc.) as a functor from a category T which is the “theory” of such objects, while the functor is a “model” of the theory.

What makes it “universal” algebra is that it can involve definitions with many sorts of objects, many operations, given as arrows, of different arities (number of inputs and outputs). This last makes sense in the monoidal context, and in particular Cartesian. Making decisions like this – what class of categories and functors we’re dealing with – specifies which doctrine the theory lives in. In the case of bicategories, this is the doctrine of categories with finite limits. In a Lawvere theory in the original sense, the doctrine is categories with finite products – so if there’s an object G, there are also objects G^n for all n. Then there are things like multiplication maps m : G^2 \rightarrow G and so on. For a category or bicategory, multiplication might be partial – so we need finite limits. A model of a theory in this doctrine is a limit-preserving functor.

So what does the theory of bicategories look like? It’s easy enough to see if you think that a (small) bicategory is a “bicategory in Sets”, and reproduce the usual definition, omitting reference to sets. It has objects Ob, Mor, and 2Mor. (This fact already means this is a “multi-sorted” theory, which goes beyond what can be done with another approach to universal algebra based on monads). Furthermore, there are maps between these objects, interpreted as source, target, and identity maps of various sorts. These form diagrams, and since we’re in a finite limit theory, there must be various objects like Pairs = Mor \times_{Ob} Mor which for sets would have the interpretation “pairs of composable morphisms”. Then there’s a composition map \circ : Pairs \rightarrow Mor… and so on. In short, in describing the axioms for a bicategory in a “nice” way (i.e. in terms of arrows, commuting diagrams, etc.), we’re giving a presentation of a certain category, Th(Bicat), in generators and relations. Then a model of the theory is a functor Th(Bicat) \rightarrow \mathcal{C} – picking out a “bicategory in \mathcal{C}”.
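To see what the object Pairs = Mor \times_{Ob} Mor looks like in a model in Sets, here is a toy sketch. It only covers the 1-categorical fragment of the theory (Ob, Mor, source, target, composition) and leaves out 2Mor entirely; the two-object example and all the names in it are hypothetical, chosen just to make the pullback visible.

```python
# A tiny category in Sets: two objects, their identities, and one arrow f : x -> y.
Ob = {"x", "y"}
Mor = {"id_x", "id_y", "f"}
source = {"id_x": "x", "id_y": "y", "f": "x"}
target = {"id_x": "x", "id_y": "y", "f": "y"}
identities = {"x": "id_x", "y": "id_y"}

# The pullback Mor x_Ob Mor: pairs (g, f) with source(g) == target(f),
# i.e. "pairs of composable morphisms".
Pairs = {(g, f) for g in Mor for f in Mor if source[g] == target[f]}


def compose_pair(g, f):
    """Composite g after f. In this toy example every composable pair contains an identity."""
    if f == identities[source[f]]:
        return g
    if g == identities[target[g]]:
        return f
    raise ValueError("no composite defined for this pair")


# The composition map  o : Pairs -> Mor, tabulated as a dictionary
compose = {(g, f): compose_pair(g, f) for (g, f) in Pairs}

for pair in sorted(Pairs):
    print(pair, "->", compose[pair])
```

The point of the finite limit doctrine is that `Pairs` is not extra data: it is determined by `source` and `target`, and a model in another category \mathcal{C} would replace the set comprehension by the corresponding pullback in \mathcal{C}.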

Now, a bicategory in Sets is a bicategory. But a bicategory in Bicat is another matter. First of all, I should say there’s something kind of odd here, since Bicat is most naturally regarded as a tricategory. However, we can regard it as a category by disregarding higher morphisms and taking 2-functors only up to equivalence to make Bicat into an honest category with associative composition. Thus, if we have a functor F : Th(Bicat) \rightarrow Bicat, we have:

  • Bicategories F(Ob), F(Mor), and F(2Mor)
  • 2-Functors F(s), F(\circ) and so on
  • satisfying conditions implied by the bicategory axioms

But each of those bicategories (in Sets!) has sets of objects, morphisms, and 2-morphisms, and one can break all the functors apart into three collections of maps acting on each of these three levels. They’ll satisfy all the conditions from the axioms – in fact, they make three new bicategories. So, for example, the object-sets of the bicategories F(Ob), F(Mor) and F(2Mor) form a bicategory using the object maps of the 2-functors F(s) and so on.

So if we say the original bicategories F(Ob) and so on are “horizontal”, and these new ones are “vertical”, we have something resembling a double category, but weak (since bicategories are weak) in both directions. The result is most naturally a four-dimensional structure (the 2-morphisms in 2Mor are most conveniently drawn as 4d, which is shown in Table 2 of the paper).

Now, the paper as it is describes all this structure without explicitly mentioning the theory Th(Bicat) except in passing – one can define “internal bicategory” without it. This is why this is a “loose end” of this paper: a major benefit of using Lawvere-style theories is the availability of morphisms of theories, which don’t come up here.

In any case, with this 4D structure in hand, what I do in the paper is (a) get some conditions that allow one to decategorify it down to Verity’s version of “double bicategory” (and even down to a bicategory); and (b) show that double cospans are an example (double spans would do equally well, but the application is to cobordisms, which are cospans). My own reason for wanting to get down to a 2D structure is the application to extended TQFT, which means we want a 2-category of cobordisms, thought of in terms of (co)spans.

Maybe in a subsequent post I’ll talk about the example itself, but one point about internalization does occur to me. Double cospans give an example of a double bicategory in the sense above – a strict model of Th(Bicat) in Bicat. In fact, they consist of “(co)spans of (co)spans” in a way that Marco Grandis formalized in terms of powers \Lambda^n, where \Lambda is the diagram (i.e. category) \bullet \leftarrow \bullet \rightarrow \bullet. One can actually think of this in terms of internalization: these are spans in a category whose objects are spans in \mathcal{C}, and whose morphisms are triples of maps in \mathcal{C} linking two spans (likewise for the span-map 2-morphisms). Yet it’s manifestly edge-symmetric: both the horizontal and vertical bicategories are the same.

As I mentioned in the previous post, there are lots of nice examples of double categories which are not edge-symmetric – sets, functions, and relations; or rings, homomorphisms, and bimodules, say. In fact, the second is only a pseudocategory – weak in one direction (composition of bimodules by tensor product is really only defined up to isomorphism). This is a significant thing about non-edge-symmetric examples. There’s much less motive for assuming both directions are equally strict. It’s also more natural in some ways: a pseudocategory is a weak model of Th(Cat) in Cat – equations in the theory are represented by (coherent) isomorphisms. This is the most general situation, and a strict model is a special case.

In the bicategory world, as I said, Bicat is a tricategory, so weaker models than the one I’ve given are possible – though they’re not symmetric, and so while one direction has composition and units as weak as a bicategory, the other direction will be weaker still. Robert Paré, in a conversation at MakkaiFest, suggested that a nice definition for a cubical n-category might have each direction being one step weaker than the previous one – a natural generalization of pseudocategories. Maybe there’s a way to make this seem natural in terms of internalization? One can iterate internalizing: having defined double bicategories, collect them together and find models of Th(Bicat) in DblBicat, and so forth. Maybe doing this as weakly as possible would give this tower of increasing weakness.

Now, I don’t have a great punchline to sum all this up, except that internalization seems to be an interesting lens with which to look at cubical n-categories.

It’s taken me a while to write this up, since I’ve been in the process of moving house – packing and unpacking and all the rest. However, a bit over a week ago, I was in Montreal, attending MakkaiFest ‘09 at the Centre de Recherches Mathématiques at the University of Montréal (and a pre-conference workshop hosted at McGill, which I’m including in the talks I mention here). This was in honour of the 70th birthday of Mihaly (Michael) Makkai, of McGill University. Makkai has done a lot of important foundational work in logic, model theory, and category theory, and a great many of the talks were from former students who’d gone on and been inspired by him, so one got a sense of the range of things he’s worked on through his life.

The broad picture of Makkai’s work was explained to us by J.P. Marquis, from the Philosophy department at U of M. He is interested in philosophy of mathematics, and described Makkai’s project by contrast with the program of axiomatization of the early 20th century, along the lines suggested by Hilbert. This program provided a formal language for concrete structures – the problem, which category theory is part of a solution to, is to do the same for abstract structures. Contrast, for instance, the concrete description of a group G as a (particular) set with some (particular) operation, with the abstract definition of a group object in a category. Makkai’s work in categorical logic, said Marquis, is about formalizing the process of abstraction that example illustrates.

Model Theory/Logic

This matter – of the relation between abstract theories and concrete models of the theories – is really what model theory is about, and this is one of the major areas Makkai has worked on. Roughly, a theory is most basically a schema with symbols for types, members of types, and some function symbols – and a collection of sentences built using these symbols (usually generated from some axioms by rules of logical inference). A model is (intuitively) an interpretation of the terms: a way of assigning concrete data to the symbols – say, a symbol for a type is assigned the set of all entities of that type, and a function symbol is assigned an actual function between sets, and so on – making all the sentences of the theory true. A morphism of models is a map that preserves all the properties of the model that can be stated using first order logic.

This is an older way to say things – Victor Harnik gave an expository talk called “Model Theory vs. Categorical Logic” in which he compared two ways of adding an equivalence relation to a theory. The model theory way (invented by Shelah) involves taking the theory (list of sentences) T and extending it to a new theory T^{eq}. This has, for instance, some new types – if we had a type for “element of group”, for example, we might then get a new type “equivalence class of elements of group”, and so on. Now, this extension is “tight” in the sense that the categories of all models of T and of T^{eq} are equivalent (by a forgetful functor Mod(T^{eq}) \rightarrow Mod(T)) – but one can prove new theorems in the extended theory. To make this clear, he described work (due to Makkai and Reyes) about pretopos completion. Here, one has the concept of a “Boolean logical category” – Set is an example, as is, for any theory, a certain category whose objects are the formulas of the theory. This is related to Lawvere theories (see below). There are logical functors between such categories – functors into Set are models, but there are also logical functors between theories. A theory T embeds into T^{eq} (abusing notation here – these are now the Boolean logical categories), and the point is that T^{eq} arises as a kind of completion of T – namely, it’s a Boolean pretopos (not just category). Moreover, it has some nice universal properties, making this point of view a bit more natural than the model-theoretic construction.

Bradd Hart’s talk, “Conceptual Completeness for Continuous Logic”, was a bit over my head, but made some use of this kind of extension of a theory to T^{eq}. The basic point seems to be to add some kind of continuous structure to logic. One example comes from a metric structure – defining a metric space of terms, where the metric function d(x,y) is some sum \sum_n \phi_n (x,y), where the \phi_n are formulas with two variables, either true or false – where true gives a 0, and false gives a 1 in this sum. This defines a distance from x to y associated to the given list of formulas \phi_n. A continuous logic is one with a structure like this. The business about equivalence relations arises if we say two things are equivalent when the distance between them is 0 – this leads to a concept of completion, and again there’s a notion that the categories of models are equivalent (though proving it here involves some notion of approximating terms to arbitrary epsilon, which doesn’t appear in standard logic).
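Here is a toy sketch of a distance built from two-variable formulas in this way. It assumes a finite, unweighted list of formulas, which may well be simpler than what the talk had in mind; the three formulas on integers are invented purely for illustration.

```python
def distance(x, y, formulas):
    """Sum over the formulas phi_n of 0 if phi_n(x, y) is true and 1 if it is false."""
    return sum(0 if phi(x, y) else 1 for phi in formulas)


# Hypothetical two-variable formulas on integers, each either true or false
formulas = [
    lambda x, y: x % 2 == y % 2,         # "x and y have the same parity"
    lambda x, y: (x >= 0) == (y >= 0),   # "x and y have the same sign"
    lambda x, y: x % 3 == y % 3,         # "x and y agree mod 3"
]

for pair in [(2, 8), (2, 5), (2, -3)]:
    print(pair, distance(*pair, formulas))
```

Two distinct terms can sit at distance 0 when every formula in the list holds of them, which is exactly the situation where one passes to equivalence classes.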

Anand Pillay gave a talk which used model theory to describe some properties of the free group on n generators. This involved a “theory of the free group” which applies to any free group, regarding each such group as a model of the theory – in fact a submodel of some large model – and using model-theoretic methods to examine “stability” properties, in some sense which amounts to a notion of defining “generic” subsets of the group.

Logic and Higher Categories

A number of talks specifically addressed the ground where logic meets higher dimensional categories, since Makkai has worked with both.

In one talk, Robert Paré described a way of thinking about first-order theories as examples of “double Lawvere theories”. Lawvere’s way of formalizing “theories and models” was to say that the theory is a category itself (which has just the objects needed to describe the kind of structure it’s a theory of) – and a model is a functor into Sets (or some other category – a model of the theory of groups in topological spaces, say, is a topological group). For example, the theory of groups includes an object G and powers of it, multiplication and inverse maps, and expresses the axioms by the fact that certain diagrams commute. A model is a functor M : Th(Grp) \rightarrow Sets, assigning to the “group object” a set of elements, which then get the group structure from the maps. Instead of a category, a double Lawvere theory uses a double category. There are two kinds of morphisms – horizontal and vertical – and these are used to represent two kinds of symbols: function symbols, and relation symbols. (For example, one can talk about the theory of an ordered field – so one needs symbols for multiplication and addition and so forth, but also for the order relation \leq). Then a model of such a theory is a double functor into the double category whose objects are sets, and whose horizontal and vertical morphisms are respectively functions and relations.

André Joyal gave a talk about the first order logic of higher structures. He started by commenting on some fields which began life close together, and are now gradually re-merging: logic and category theory; category theory and homotopy theory (via higher categories); homotopy theory and algebraic geometry. The higher categories Joyal was thinking of are quasicategories, or “(\infty, 1)-categories”, which are simplicial sets satisfying a weak version of a horn-filling condition (the “strict” version of this, in which inner horns have unique fillers, is satisfied by N(C), the nerve of a category C – there’s an n-simplex for each sequence of n composable morphisms, whose other edges are the various composites, and whose faces are “compositors”, “associators”, and so on – which for N(C) are identities). The point of this is that one can reproduce most of category theory for quasicategories – in particular, he mentioned limits and colimits, factorization systems, pretoposes, and model theory.

Moving to quasicategories on one side of the parallel between category theory and logic has a corresponding move on the other side – on the logic side, one aspect is that the usual notion of a language is replaced by what’s called Martin-Löf type theory. This, in fact, was the subject of Michael Warren’s talk, “Martin-Löf complexes” (I reported on a similar talk he gave at Octoberfest last year). The idea here is to start by defining a globular set, given a theory and type A – a complex whose n-cells have two faces, of dimension (n-1). The 0-cells are just terms of some type A. The 1-cells are terms of types like \underline{A}(a,b), where a and b are variables of type A – the type has an interpretation as a proposition that a=b “extensionally” (i.e. not via a proof – but as for instance when two programs with non-equivalent code happen to always produce the same output). This kind of operation can be repeated to give higher cells, like \underline{A(a,b)}(f,g), and so on. Given a globular set G, one gets a theory by an adjoint construction. Putting the two together, one has a monad on the category of globular sets – algebras for the monad are Martin-Löf complexes. Throwing in syntactic rules to truncate higher cells (I suppose by declaring all cells to be identities) gives n-truncated versions of these complexes, MLC_n. Then there is some interesting homotopy theory, in that the category of n-truncated Martin-Löf complexes is expected to be a model for homotopy n-types. For example, MLC_0 is equivalent to Sets, and there is an adjunction (in fact, a Quillen equivalence – that is, a kind of “homotopy” equivalence) between MLC_1 and Gpd.

Category Theory/Higher Categories

There were a number of talks that just dealt with categories – including higher categories – in their own right. Makkai has worked, for example, on computads, which were touched on by Marek Zawadowski in one of his two talks (one in the pre-conference workshop, the other in the conference). The first was about categories of “many-to-one shapes”, which are important to computads – these are a notion of higher-category, where every cell takes many “input” faces to one “output” face. Zawadowski described a “shape” of an n-cell as an initial object in a certain category built from the category of computads with specified faces. Then there’s a category of shapes, and an abstract description of “shape” in terms of a graded tensor theory (graded for dimension, and tensor because there’s a notion of composition, I believe). Zawadowski’s second talk, “Opetopic Sets in Lax Monoidal Fibrations”, dealt with a similar topic from a different point of view. A lax monoidal fibration (LMF) is a kind of gadget for dealing with multi-level structures (categories, multicategories, quasicategories, etc). There’s a lot of stuff here I didn’t entirely follow, but just to illustrate: categories arise as LMF, by the fibration cod : Set^{B} \rightarrow Set, where B is the category with two objects M, O, and two arrows from M to O. An object in the functor category Set^{B} consists of a “set of morphisms and set of objects” with maps – making this a category involves the monoidal structure, and how composition is defined, and the real point is that this is quite general machinery.

Joachim Lambek and Gonzalo Reyes, both longtime collaborators and friends of Makkai, also both gave talks that touched on physics and categories, though in very different ways. Lambek talked about the “Lorentz category” and its appearance in special relativity.  This involves a reformulation of SR in terms of biquaternions: like complex numbers, these are of the form u + iv, but u and v are quaternions.  They have various conjugation operations, and the geometry of SR can be described in terms of their algebra (just as, say, rotations in 3D can be described in terms of quaternions).  The Lorentz category is a way of organizing this – its two objects correspond to “unconjugated” and “conjugated” states.

Gonzalo Reyes gave a derivation of General Relativity in the context of synthetic differential geometry.  The substance of this derivation is not so different from the usual one, but with one exception.  Einstein’s field equations can be derived in terms of the motions of small regions full of freely falling test particles – synthetic differential geometry makes it possible to do the same analysis using infinitesimals rigorously all the way through.  The basic point here is that in SDG one replaces the real line as usually conceived, with a “real line with infinitesimals” (think of the ring \mathbb{R}[\epsilon]/\langle \epsilon^2 \rangle, which is like the reals, but has the infinitesimal \epsilon, whose square is zero).
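The ring \mathbb{R}[\epsilon]/\langle \epsilon^2 \rangle is easy to play with directly. Here is a minimal Python sketch of it, using floats as a stand-in for the reals; the names `Dual`, `EPS` and `derivative` are my own, and this is the usual “dual numbers” trick rather than SDG proper. Since \epsilon^2 = 0, evaluating a polynomial f at x + \epsilon gives f(x) + f'(x)\epsilon, so the derivative can be read off the \epsilon coefficient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """An element re + eps*epsilon of R[epsilon]/<epsilon^2>."""
    re: float
    eps: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # (a + b e)(c + d e) = ac + (ad + bc) e, because e^2 = 0.
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__


def _lift(x):
    """Treat an ordinary number as a dual number with no infinitesimal part."""
    return x if isinstance(x, Dual) else Dual(float(x))


EPS = Dual(0.0, 1.0)  # the infinitesimal itself


def derivative(f, x):
    """Read f'(x) off the epsilon coefficient of f(x + epsilon)."""
    return f(Dual(x, 1.0)).eps


def cube_plus_line(x):
    return x * x * x + 2 * x


slope = derivative(cube_plus_line, 3.0)  # 3 * 3^2 + 2
```

No limits are taken anywhere: the nilpotent \epsilon does all the work, which is the flavour of argument SDG makes rigorous.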

Among other talks: John Power talked about the correspondence between Lawvere theories in universal algebra and finitary tree monads on sets – and asked about what happens to the left-hand side of this correspondence when we replace “sets” with other categories on the right-hand side. Jeff Egger talked about measure theory from a categorical point of view – namely, the correspondence of NCG between C*-algebras and “noncommutative” topological spaces, and between W*-algebras and “noncommutative” measure spaces, thought of in terms of locales. Hongde Hu talked about the “codensity theorem”, and a way to classify certain kinds of categories – he commented on how it was inspired by Makkai’s approach to mathematics: (1) find new proofs of old theorems, (2) standardize the concepts used in them, and (3) prove new theorems with those concepts. Fred Linton gave a talk describing Heath’s “V-space”, which is a half-plane with a funny topology whose open sets are “V” shapes, and described how the topos of locally finite sheaves over it has surprising properties having to do with nonexistence of global sections. Manoush Sadrzadeh, whom I met recently at CQC (see the bottom of the previous post) was again talking about linguistics using monoidal categories – she described some rules for “clitic movement” and changes in word order, and what these rules look like in categorical terms.

Other

A few other talks are a little harder for me to fit into the broad classification above.  There was Charles Steinhorn’s talk about ordered “o-minimal” structures, which touched on a bit of economics – essentially, a lot of economics is based on the assumption that preference orders can be made into real-valued functions, but in fact in many cases one has (variants on) “lexicographic order”, involving ranked priorities.  He talked about how typically one has a space of possibilities which can be cut up into cells, with one sort of order in each cell.  There was Julia Knight, talking about computable structures of “high Scott rank” – in particular, this is about infinite structures that can still be dealt with computably – for example, infinitary logical formulas involving an infinite number of “OR” statements where all the terms being joined are of some common form.  This ends up with an analysis of certain infinite trees.  Hal Kierstead gave a talk about Ramsey theory which I found notable because it used the kind of construction based on a game: to prove that any colouring of a graph (or hypergraph) has some property, one devises a game where one player tries to build a graph, and the other tries to colour it, and proves a winning strategy for one player.  Finally, Michael Barr gave a talk about a duality between certain categories of modules over commutative rings.

All in all, an interesting conference, with plenty of food for thought.

Barr, Kierstead, Knight, Steinhorn

Continuing from the previous post…

I realized I accidentally omitted Klaas Landsman’s talk on the Kochen-Specker theorem, in light of topos theory.  This overlaps a lot with the talk by Andreas Doring, although there are some significant differences.  (Having heard only what Andreas had to say about the differences, I won’t attempt to summarize them).  Again, the point of the Kochen-Specker theorem is that there isn’t a “state space” model for a quantum system – in this talk, we heard the version saying that there are no “locally sigma-Boolean” maps, from operators on a Hilbert space, to \{ 0, 1 \}.  (This is referring to sigma-algebras (of measurable sets on a space), and Boolean algebras of subsets – if there were such a map, it would be representing the system in terms of a lattice equivalent to some space).  As with the Isham/Doring approach, they then try to construct something like a state space – internal to some topos.  The main difference is that the toposes are both categories of functors into sets from some locale – but here the functors are covariant, rather than contravariant.

Now, roughly speaking, the remaining talks could be grouped into two kinds:

Quantum Foundations

Many people came to this conference from a physics-oriented point of view.  So for instance Rafael Sorkin gave a talk asking “what is a quantum reality?”. He was speaking from a “histories” interpretation of quantum systems. So, by contrast, a “classical reality” would mean one worldline: out of some space of histories, one of them happens. In quantum theory, you typically use the same space of histories, but have some kind of “path integral” or “sum over histories” when you go to compute the probabilities of given events happening. In this context, “event” means “a subset of all histories” (e.g. the subset specified by a statement like “it rained today”). So his answer to the question is: a reality should be a way of answering all questions about all events.  This is called a “coevent”.  Sorkin’s answer to “what is a quantum reality?” is: “a primitive, preclusive coevent”.

In particular, it’s a measure \mu.  For a classical system, “answering” questions means yes/no, whether the one history is in a named event – for a quantum system, it means specifying a path integral over all events – i.e. a measure on the space of events.  This measure needs some nice properties, but it’s not, for instance, a probability measure (it’s complex valued, so there can be interference effects).  Preclusion has to do with the fact that the measure of an event being zero means that it doesn’t happen – so one can make logical inferences about which events can happen.

Other talks addressing foundational problems in physics included Lucien Hardy’s: he talked about how to base predictive theories on operational structures – and put to the audience the question of whether the structures he was talking about can be represented categorically or not.  The basic idea is that an “operational structure” is some collection of operations that represents a physical experiment whose outcome we might want to predict.  They have some parameters (“knob settings”), outcomes (classical “readouts”), and inputs and outputs for the things they study and affect (e.g. a machine takes in and spits out an electron, doing something in the middle).  This sort of thing can be set up as a monoidal category – but the next idea, “object-oriented operationalism”, involved components having “connections” (given relations between their inputs) and “coincidences” (predictable correlations in output).  The result was a different kind of diagram language for describing experiments, which can be put together using a “causaloid product” (he referred us to this paper, or a similar one, on this).

Robert Spekkens gave a talk about quantum theory as a probability theory – there are many parallels, though the complex amplitudes give QM phenomena like interference.  Instead of a “random variable” A, one has a Hilbert space H_A; instead of a (positive) function of A, one has a positive operator on H_A; standard things in probability have analogs in the quantum world.  What Robert Spekkens’ talk dealt with was how to think about conditional probabilities and Bayesian inference in QM.  One of the basic points is that when calculating conditional probabilities, you generally have to divide by some probability, which encounters difficulties translating into QM.  He described how to construct a “conditional density operator” along similar lines – replacing “division” by a “distortion” operation with an analogous meaning.  The whole thing deeply uses the Choi-Jamiolkowski isomorphism, a duality between “states and channels”.  In terms of the string diagrams Bob Coecke et al. are keen on, this isomorphism can be seen as taking a special cup which creates entangled states into an ordinary cup, with an operator on one side.  (I.e. it allows the operation to be “slid off” the cup).  The talk carried this through, and ended up defining a quantum version of the probabilistic concept of “conditional independence” (i.e. events A and C are independent, given that B occurred).

A more categorical look at foundational questions was given by Rick Blute’s talk on “Categorical Structures in AQFT”, i.e. Algebraic Quantum Field Theory.  This is a formalism for QFT which takes into account the causal structure it lives on – for example, on Minkowski space, one has a causal order for points, with x \leq y if there is a future-directed null or timelike curve from x to y.  Then there’s an “interval” (more literally, a double cone) [x,y] = \{ z | x \leq z \leq y\}, and these cones form a poset under inclusion (so this is a version of the poset of subspaces of a space which keeps track of the causal structure).  Then an AQFT is a functor \mathbb{A} from this poset into C*-algebras (taking inclusions to inclusions): the idea is that each local region of space has its own algebra of observables relevant to what’s found there.  Of course, these algebras can all be pieced together (i.e. one can take a colimit of the diagram of inclusions coming from all regions on spacetime).  The result is \hat{\mathbb{A}}.  Then, one finds a category of certain representations of it on a Hilbert space H (namely, “DHR” representations).  It turns out that this category is always equivalent to the representations of some group G, the gauge group of the AQFT.  Rick talked about these results, and suggested various ways to improve them – for example, by improving how one represents spacetime.
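The causal order and the double cones are simple enough to sketch in Python. This is only an illustration under my own assumptions: points are tuples (t, x1, x2, x3) in units where c = 1, and the function names `precedes`, `in_double_cone` and `cone_contained` are hypothetical. In flat Minkowski space, x \leq y holds exactly when the time difference is at least the spatial distance.

```python
import math


def precedes(x, y):
    """x <= y: y lies in (or on the boundary of) the future light cone of x."""
    dt = y[0] - x[0]
    dist = math.sqrt(sum((b - a) ** 2 for a, b in zip(x[1:], y[1:])))
    return dt >= dist


def in_double_cone(z, x, y):
    """z belongs to [x, y] = { z | x <= z <= y }."""
    return precedes(x, z) and precedes(z, y)


def cone_contained(inner, outer):
    """For a non-empty cone [a, b]: [a, b] is inside [x, y] iff x <= a and b <= y."""
    (a, b), (x, y) = inner, outer
    return precedes(x, a) and precedes(b, y)


origin = (0, 0, 0, 0)
later = (2, 0, 0, 0)

timelike = precedes(origin, (2, 1, 0, 0))    # inside the light cone
null = precedes(origin, (1, 1, 0, 0))        # on the light cone
spacelike = precedes(origin, (1, 2, 0, 0))   # outside: not causally related

inside = in_double_cone((1, 0, 0, 0), origin, later)
outside = in_double_cone((1, 1.5, 0, 0), origin, later)
nested = cone_contained(((0.5, 0, 0, 0), (1.5, 0, 0, 0)), (origin, later))
```

The `cone_contained` check is the inclusion order on double cones, i.e. the poset that the functor \mathbb{A} is defined on.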

The last talk I’d attempt to shoehorn into this category was by Daniel Lehmann.  He was making an analysis of the operation “tensor product”, that is, the monoidal operation in Hilb.  For such a fundamental operation – physically, it represents taking two systems and looking at the combined system containing both – it doesn’t have a very clear abstract definition.  Lehmann presented a way of characterizing it by a universal property analogous to the universal definitions for products and coproducts.  This definition makes sense whenever there is an idea of a “bimorphism” – a thing which abstracts the properties of a “bilinear map” for vector spaces.  This seems to be closely related to the link between multicategories and monoidal categories (discussed in, for example, Tom Leinster’s book).

Categories and Logic

Some less physics-oriented and more categorical talks rounded out the part of the program that I saw.  One I might note was Mike Stay’s talk about the Rosetta Stone paper he wrote with John Baez.  The Rosetta Stone, of course, was a major archaeological find from the Ptolemaic period in Egypt – by that point, Egypt had been conquered by Alexander of Macedon and had a Greek speaking elite, but the language wasn’t widespread.  So the stone is an official pronouncement with a message in Greek, and in two written forms of Egyptian (hieroglyphic and demotic), neither of which had been readable to moderns until the stone was uncovered and correspondences could be deduced between the same message in a known language and two unknown ones.  The idea of their paper, and Mike’s talk, is to collect together analogs between four subjects: physics, topology, computation, and logic.  The idea is that each can be represented in terms of monoidal categories.  In physics, there is the category of Hilbert spaces; in topology one can look at the category of manifolds and cobordisms; in computation, there’s a monoidal category whose objects are data types, and whose morphisms are (equivalence classes) of programs taking data of one type in and returning data of another type; in logic, one has objects being propositions and morphisms being (classes) of proofs of one proposition from another.  The paper has a pretty extensive list of analogs between these domains, so go ahead and look in there for more!

Peter Selinger gave a talk about “Higher-Order Quantum Computation”.  This had to do with interesting phenomena that show up when dealing with “higher-order types” in quantum computers.  These are “data types”, as I just described – the “higher-order” types can be interpreted by blurring the distinction between a “system” and a “process”.  A data type describing a system we might act on might be A or B.  A higher order type like A \multimap B describes a process which takes something of type A and returns something of type B.  One could interpret this as a black box – and performing processes on a type A \multimap B is like studying that black box as a system itself.  This type is like an “internal hom” – and so one might like to say, “well, it’s dual to tensor – so it amounts to taking A^* \otimes B, since we’re in the category of Hilbert spaces”.  The trouble is, for physical computation, we’re not quite in the category where that works, because not all operators are significant: only some class of completely positive operators are physical.  So we don’t have the hom-tensor duality to use (equivalently, don’t have a well-behaved dual), and these types have to be considered in their own right.  And, because computations might not halt, operations studying a black box might not halt.  So in particular, a “co-co-qubit” isn’t the same as a qubit.  A co-qubit is a black box which eats a qubit and terminates with some halting probability.  A co-co-qubit eats a co-qubit and does the same.  If not for the halting probability, one could equally well see a qubit “eating” a co-co-qubit as the reverse.  But in fact they’re different.  A key fact in Peter’s talk is that quantum computation has new logical phenomena happening with types of every higher order.  Quantifying this (an open problem, apparently) would involve finding some equivalent of Bell inequalities that apply to every higher order of type.
It’s interesting to see how different quantum computing is, in not-so-obvious ways, from the classical kind.

Manoush Sadrzadeh gave a talk describing how “string diagrams” from monoidal categories, and representations of them, have been used in linguistics.  The idea is that the grammatical structure of a sentence can be built by “composing” structures associated to words – for example, a verb can be composed on left and right with subject and object to build a phrase.  She described some of the syntactic analysis that went into coming up with such a formalism.  But the interesting bit was to compare putting semantics on that syntax to taking a representation.  In particular, she described the notion of a semantic space in linguistics: this is a large-dimensional vector space that compares the meanings of words.  A rough but surprisingly effective way to clump words together by meaning just uses the statistics on a big sample of text, measuring how often they co-occur in the same context. Then there is a functor that “adds semantics” by mapping a category of string diagrams representing the syntax of sentences into one of vector spaces like this.  Applying the kind of categorical analysis usually used in logic to natural language seemed like a pretty neat idea – though it’s clear one has to make many more simplifying assumptions.
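The co-occurrence statistics behind such a semantic space are easy to sketch. The following Python is a toy version under my own assumptions, not the construction from the talk: “context” means the immediately neighbouring words, the two-sentence corpus is made up, and the names `cooccurrence_vectors` and `cosine` are hypothetical. Each word gets a vector of counts indexed by its neighbours, and words are compared by the cosine of the angle between their vectors.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=1):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Made-up corpus, purely for illustration.
corpus = ["the cat eats fish", "the dog eats meat"]
vectors = cooccurrence_vectors(corpus)

cat_dog = cosine(vectors["cat"], vectors["dog"])    # same neighbours
cat_fish = cosine(vectors["cat"], vectors["fish"])  # one shared neighbour
cat_eats = cosine(vectors["cat"], vectors["eats"])  # no shared neighbours
```

Even on two sentences, “cat” and “dog” come out as pointing in the same direction because they occur in the same contexts; a real semantic space does this with a big sample of text and many more dimensions.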

On the whole, it was a great conference with a great many interesting people to talk to – as you might guess from the fact that it took me three posts to comment on everything I wanted.
