

Why Higher Geometric Quantization

The largest single presentation was a pair of talks on “The Motivation for Higher Geometric Quantum Field Theory” by Urs Schreiber, running to about two and a half hours, based on these notes. This was probably the clearest introduction I’ve seen so far to the motivation for the program he’s been developing for several years. Broadly, the idea is to develop a higher-categorical analog of geometric quantization (GQ for short).

One guiding idea behind this is that we should really be interested in quantization over (higher) stacks, rather than merely spaces. This leads inexorably to a higher-categorical version of GQ itself. The starting point, though, is that the defining features of stacks capture two crucial principles from physics: the gauge principle, and locality. The gauge principle means that we need to keep track not just of connections, but also of gauge transformations, which form respectively the objects and morphisms of a groupoid. “Locality” means that this groupoid of configurations of a physical field on spacetime is determined by the local configurations on regions as small as you like (together with information about how to glue together the data on small regions into larger regions).

Some particularly simple cases can be described globally: a scalar field gives the space of all scalar functions, namely maps into \mathbb{C}; sigma models generalise this to the space of maps \Sigma \rightarrow M for some other target space. These are determined by their values pointwise, so of course are local.

More generally, physicists think of a field theory as given by a fibre bundle V \rightarrow \Sigma (the previous examples being described by trivial bundles \pi : M \times \Sigma \rightarrow \Sigma), where the fields are sections of the bundle. Lagrangian physics is then described by a form on the jet bundle of V, i.e. the bundle whose fibre over p \in \Sigma consists of the space describing the possible first k derivatives of a section over that point.

More generally, a field theory gives a procedure F for taking some space with structure – say a (pseudo-)Riemannian manifold \Sigma – and producing a moduli space X = F(\Sigma) of fields. The sigma models happen to be representable functors: F(\Sigma) = Maps(\Sigma,M) for some M, the representing object. A prestack is just any functor taking \Sigma to a moduli space of fields. A stack is one which has a “descent condition”, which amounts to the condition of locality: knowing values on small neighbourhoods and how to glue them together determines values on larger neighbourhoods.
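The descent condition can be written out explicitly in the simplest (set-valued) case. This is only a sketch: the open cover \{U_i\} of \Sigma and the names \phi_i for local fields are labels introduced here for illustration, not notation from the talk.

```latex
% Descent for a set-valued prestack F and an open cover \{U_i\} of \Sigma:
% F(\Sigma) is the equalizer of the two restriction maps.
F(\Sigma) \longrightarrow \prod_i F(U_i) \rightrightarrows \prod_{i,j} F(U_i \cap U_j)

% Concretely: local fields (\phi_i) glue to a unique global field exactly when
\phi_i|_{U_i \cap U_j} = \phi_j|_{U_i \cap U_j} \quad \text{for all } i, j.
```

For a genuine stack, where F takes values in groupoids, the equality on overlaps is replaced by a chosen gauge transformation between \phi_i and \phi_j, and these choices must themselves be compatible on triple overlaps.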

The Yoneda lemma says that, for reasonable notions of “space”, the category \mathbf{Spc} from which we picked target spaces M embeds into the category of stacks over \mathbf{Spc} (Riemannian manifolds, for instance) and that the embedding is fully faithful – so we should just think of this as a generalization of space. However, it’s a generalization we need, because gauge theories determine non-representable stacks. What’s more, the “space” of sections of one of these fibred stacks is also a stack, and this is what plays the role of the moduli space for gauge theory! For higher gauge theories, we will need higher stacks.

All of the above is the classical situation: the next issue is how to quantize such a theory. It involves a generalization of GQ. Now a physicist who actually uses GQ will find this perspective weird, but it flows from just the same logic as the usual method.

In ordinary GQ, you have some classical system described by a phase space, a manifold X equipped with a pre-symplectic 2-form \omega \in \Omega^2(X). Intuitively, \omega describes how the space, locally, can be split into conjugate variables. In the phase space for a particle in n-space, these are the “position” and “momentum” variables, and \omega = \sum_i dx^i \wedge dp^i; many other systems have analogous conjugate variables. But what really matters is the form \omega itself, or rather its cohomology class.
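As a worked instance of the particle example, here is the standard computation showing that this \omega is exact, hence closed. The 1-form \theta is introduced here for illustration, and sign conventions for it vary between authors.

```latex
% Phase space X = T^*\mathbb{R}^n with coordinates (x^1,\dots,x^n,p^1,\dots,p^n).
\theta = -\sum_i p^i \, dx^i
\qquad\Longrightarrow\qquad
d\theta = -\sum_i dp^i \wedge dx^i = \sum_i dx^i \wedge dp^i = \omega

% \omega is closed (d\omega = d^2\theta = 0) and nondegenerate:
\omega^n = \underbrace{\omega \wedge \dots \wedge \omega}_{n} \neq 0
```

Since \omega is exact in this example, its cohomology class is zero, which is the situation where a trivial line bundle suffices.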

Then one wants to build a Hilbert space describing the quantum analog of the system, but in fact, you need a little more than (X,\omega) to do this. The Hilbert space is a space of sections of some bundle whose fibres look like copies of the complex numbers, called the “prequantum line bundle”. It needs to be equipped with a connection, whose curvature is a 2-form in the class of \omega. (If \omega is not symplectic, i.e. is degenerate, this implies there’s some symmetry on X, in which case the line bundle had better be equivariant so that physically equivalent situations correspond to the same state.) The easy case is the trivial bundle, so that we get a space of functions, like L^2(X) (for some measure compatible with \omega). In general, though, this function-space picture only makes sense locally in X: this is why the choice of prequantum line bundle is important to the interpretation of the quantized theory.

Since the crucial geometric thing here is a bundle over the moduli space, when the space is a stack, and in the context of higher gauge theory, it’s natural to seek analogous constructions using higher bundles. This would involve, instead of a (pre-)symplectic 2-form \omega, an (n+1)-form called a (pre-)n-plectic form (for an introductory look at this, see Chris Rogers’ paper on the case n=2 over manifolds). This will give a higher analog of the Hilbert space.

Now, maps between Hilbert spaces in GQ come from Lagrangian correspondences – these might be maps of moduli spaces, but in general they consist of a “space of trajectories” equipped with maps into a space of incoming and outgoing configurations. This is a span of pre-symplectic spaces (equipped with pre-quantum line bundles) that satisfies some nice geometric conditions which make it possible to push a section of said line bundle through the correspondence. Since each prequantum line bundle can be seen as maps out of the configuration space into a classifying space (for U(1), or in general an n-group of phases), we get a square. The action functional is a cell that fills this square (see the end of 2.1.3 in Urs’ notes). This is a diagrammatic way to describe the usual GQ construction: the advantage is that it can then be repeated in the more general setting without much change.

This much is about as far as Urs got in his talk, but the notes go further, talking about how to extend this to infinity-stacks, and how the Dold-Kan correspondence tells us nicer descriptions of what we get when linearizing – since quantization puts us into an Abelian category.

I enjoyed these talks, although they were long and Urs came out looking pretty exhausted, because while I’ve seen several others on this program, this was the first time I’ve seen it discussed from the beginning, with a lot of motivation. This was presumably because we had a physically-minded part of the audience, whereas I’ve mostly seen these for mathematicians, and usually they come in somewhere in the middle and being more time-limited miss out some of the details and the motivation. The end result made it quite a natural development. Overall, very helpful!

To continue from the previous post

Twisted Differential Cohomology

Ulrich Bunke gave a talk introducing differential cohomology theories, and Thomas Nikolaus gave one about a twisted version of such theories (unfortunately, perhaps in the wrong order). The idea here is that cohomology can give a classification of field theories, and if we don’t want the theories to be purely topological, we would need to refine this. A cohomology theory is a (contravariant) functorial way of assigning to any space X, which we take to be a manifold, a \mathbb{Z}-graded group: that is, a tower of groups of “cocycles”, one group for each n, with some coboundary maps linking them. (In some cases, the groups are also rings.) An example is the group of differential forms, graded by degree.
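To make the differential forms example concrete, here is the de Rham complex and its cohomology written out, as a reminder rather than anything specific to the talks. The degree index k is a label chosen here.

```latex
% De Rham complex of a manifold X (n = \dim X):
0 \to \Omega^0(X) \xrightarrow{d} \Omega^1(X) \xrightarrow{d} \cdots \xrightarrow{d} \Omega^n(X) \to 0,
\qquad d \circ d = 0

% Cohomology in degree k: closed forms modulo exact forms
H^k_{dR}(X) = \frac{\ker\big(d : \Omega^k(X) \to \Omega^{k+1}(X)\big)}{\operatorname{im}\big(d : \Omega^{k-1}(X) \to \Omega^k(X)\big)}

% Contravariance: a smooth map f : X \to Y induces f^* : H^k_{dR}(Y) \to H^k_{dR}(X)
```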

Cohomology theories satisfy some axioms – for example, the Mayer-Vietoris sequence has to apply whenever you cut a manifold into parts. Differential cohomology relaxes one axiom, the requirement that cohomology be a homotopy invariant of X. Given a differential cohomology theory, one can impose equivalence relations on the differential cocycles to get a theory that does satisfy this axiom – so we say the finer theory is a “differential refinement” of the coarser. So, in particular, ordinary cohomology theories are classified by spectra (this is related to the Brown representability theorem), whereas the differential ones are represented by sheaves of spectra – where the constant sheaves represent the cohomology theories which happen to be homotopy invariants.

The “twisting” part of this story can be applied to either an ordinary cohomology theory, or a differential refinement of one (though this needs similarly refined “twisting” data). The idea is that, if R is a cohomology theory, it can be “twisted” over X by a map \tau: X \rightarrow Pic_R into the “Picard group” of R. This is the group of invertible R-modules (where an R-module means a module for the cohomology ring assigned to X) – essentially, tensoring with these modules is what defines the “twisting” of a cohomology element.

An example of all this is twisted differential K-theory. Here the groups are of isomorphism classes of certain vector bundles over X, and the twisting is particularly simple (the Picard group in the topological case is just \mathbb{Z}_2). The main result is that, while topological twists are classified by appropriate gerbes on X (for K-theory, U(1)-gerbes), the differential ones are classified by gerbes with connection.

Fusion Categories

Scott Morrison gave a talk about Classifying Fusion Categories, the point of which was just to collect together a bunch of results constructing particular examples. The talk opens with a quote by Rutherford: “All science is either physics or stamp collecting” – that is, either about systematizing data and finding simple principles which explain it, or about collecting lots of data. This talk was unabashed stamp-collecting, on the grounds that we just don’t have a lot of data to systematically understand yet – and for that very reason I won’t try to summarize all the results, but the slides are well worth a look-over. The point is that fusion categories are very useful in constructing TQFT’s, and there are several different constructions that begin “given a fusion category \mathcal{C}”… and yet there aren’t all that many examples, and very few large ones, known.

Scott also makes the analogy that fusion categories are “noncommutative finite groups” – which is a little confusing, since not all finite groups are commutative anyway – but the idea is that the symmetric fusion categories are exactly the representation categories of finite groups. So general fusion categories are a non-symmetric generalization of such groups. Since classifying finite groups turned out to be difficult, and involve a laundry-list of sporadic groups, it shouldn’t be too surprising that understanding fusion categories (which, for the symmetric case, include the representation categories of all these examples) should be correspondingly tricky. Since, as he points out, we don’t have very many non-symmetric examples beyond rank 12 (analogous to knowing only finite groups with at most 12 elements), it’s likely that we don’t have a very good understanding of these categories in general yet.

There were a couple of talks – one during the workshop by Sonia Natale, and one the previous week by Sebastian Burciu, whom I also had the chance to talk with that week – about “Equivariantization” of fusion categories, and some fairly detailed descriptions of what results. The two of them have a paper on this which gives more details, which I won’t summarize – but I will say a bit about the construction.

An “equivariantization” of a category C acted on by a group G is supposed to be a generalization of the notion of the set of fixed points for a group acting on a set. The category C^G has objects which consist of an object x \in C which is fixed by the action of G, together with an isomorphism \mu_g : x \rightarrow x for each g \in G, satisfying a bunch of unsurprising conditions like being compatible with the group operation. The morphisms are maps in C between the objects, which form commuting squares for each g \in G. Their paper, and the talks, described how this works when C is a fusion category – namely, C^G is also a fusion category, and one can work out its fusion rules (i.e. monoidal structure). In some cases, it’s a “group theoretical” fusion category (it looks like Rep(H) for some group H) – or a weakened version of such a thing (it’s Morita equivalent to one).
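The “unsurprising conditions” can be sketched as follows, under the simplifying assumption (made here, not in the talks) that G acts strictly. The functors T_g and the second equivariant object (y,\nu) are names introduced for illustration, and conventions for the direction of \mu_g differ between sources.

```latex
% Assume G acts strictly: functors T_g : C \to C with T_g T_h = T_{gh}, T_e = \mathrm{id}.
% An equivariant object is (x, \{\mu_g\}) with T_g(x) = x and isomorphisms
\mu_g : x \to x, \qquad \mu_e = \mathrm{id}_x, \qquad \mu_{gh} = \mu_g \circ T_g(\mu_h)

% A morphism (x,\mu) \to (y,\nu) is f : x \to y in C with
f \circ \mu_g = \nu_g \circ T_g(f) \quad \text{for all } g \in G
```

When every T_g is the identity functor, the first condition reduces to \mu_{gh} = \mu_g \circ \mu_h, i.e. a group homomorphism from G to the automorphisms of x.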

A nice special case of this is if the group action happens to be trivial, so that every object of C is a fixed point. In this case, C^G is just the category of objects of C equipped with a G-action, and the intertwining maps between these. For example, if C = Vect, then C^G = Rep(G) (in particular, a “group-theoretical fusion category”). What’s more, this construction is functorial in G itself: given a subgroup H \subset G, we get an adjoint pair of functors between C^G and C^H, which in our special case are just the induced-representation and restricted-representation functors for that subgroup inclusion. That is, we have a Mackey functor here. These generalize, however, to any fusion category C, and to nontrivial actions of G on C. The point of their paper, then, is to give a good characterization of the categories that come out of these constructions.
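In the special case C = Vect, the adjunction between induction and restriction is the classical Frobenius reciprocity. The statement below assumes finite groups and complex representations; V and W are names chosen here.

```latex
% H \subset G finite groups, V \in Rep(G), W \in Rep(H)
\mathrm{Hom}_G\big(\mathrm{Ind}_H^G W, \, V\big) \cong \mathrm{Hom}_H\big(W, \, \mathrm{Res}^G_H V\big)

% with the induced representation given by
\mathrm{Ind}_H^G W = \mathbb{C}[G] \otimes_{\mathbb{C}[H]} W,
\qquad \dim \mathrm{Ind}_H^G W = [G:H] \cdot \dim W
```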

Quantizing with Higher Categories

The last talk I’d like to describe was by Urs Schreiber, called Linear Homotopy Type Theory for Quantization. Urs has been giving evolving talks on this topic for some time, and it’s quite a big subject (see the long version of the notes above if there’s any doubt). However, I always try to get a handle on these talks, because it seems to be describing the most general framework that fits the general approach I use in my own work. This particular one borrows a lot from the language of logic (the “linear” in the title alludes to linear logic).

Basically, Urs’ motivation is to describe a good mathematical setting in which to construct field theories using ingredients familiar to the physics approach to “field theory”, namely… fields. (See the description of Kevin Walker’s talk.) Also, Lagrangian functionals – that is, the notion of a physical action. Constructing TQFT from modular tensor categories, for instance, is great, but the fields and the action seem to be hiding in this picture. There are many conceptual problems with field theories – like the mathematical meaning of path integrals, for instance. Part of the approach here is to find a good setting in which to locate the moduli spaces of fields (and the spaces in which path integrals are done). Then, one has to come up with a notion of quantization that makes sense in that context.

The first claim is that the category of such spaces should form a differentially cohesive infinity-topos which we’ll call \mathbb{H}. The “infinity” part means we allow morphisms between field configurations of all orders (2-morphisms, 3-morphisms, etc.). The “topos” part means that all sorts of reasonable constructions can be done – for example, pullbacks. The “differentially cohesive” part captures the sort of structure that ensures we can really treat these as spaces of the suitable kind: “cohesive” means that we have a notion of connected components around (it’s implemented by having a bunch of adjoint functors between spaces and points). The “differential” part is meant to allow for the sort of structures discussed above under “differential cohomology” – really, that we can capture geometric structure, as in gauge theories, and not just topological structure.

In this case, we take \mathbb{H} to have objects which are spectral-valued infinity-stacks on manifolds. This may be unfamiliar, but the main point is that it’s a kind of generalization of a space. Now, the sort of situation where quantization makes sense is: we have a space (i.e. \mathbb{H}-object) of field configurations to start, then a space of paths (this is WHERE “path-integrals” are defined), and a space of field configurations in the final system where we observe the result. There are maps from the space of paths to identify starting and ending points. That is, we have a span:

A \leftarrow X \rightarrow B

Now, in fact, these may all lie over some manifold, such as B^n(U(1)), the classifying space for U(1) (n-1)-gerbes. That is, we don’t just have these “spaces”, but these spaces equipped with one of those pieces of cohomological twisting data discussed up above. That enters the quantization like an action (it’s WHAT you integrate in a path integral).

Aside: To continue the parallel, quantization is playing the role of a cohomology theory, and the action is the twist. I really need to come back and complete an old post about motives, because there’s a close analogy here. If quantization is a cohomology theory, it should come by factoring through a universal one. In the world of motives, where “space” now means something like “scheme”, the target of this universal cohomology theory is a mild variation on just the category of spans I just alluded to. Then all others come from some functor out of it.

Then the issue is what quantization looks like on this sort of scenario. The Atiyah-Segal viewpoint on TQFT isn’t completely lost here: quantization should be a functor into some monoidal category. This target needs properties which allow it to capture the basic “quantum” phenomena of superposition (i.e. some additivity property), and interference (some actual linearity over \mathbb{C}). The target category Urs talked about was the category of E_{\infty}-rings. The point is that these are just algebras that live in the world of spectra, which is where our spaces already lived. The appropriate target will depend on exactly what \mathbb{H} is.

But what Urs did do was give a characterization of what the target category should be LIKE for a certain construction to work. It’s a “pull-push” construction: see the link way above on Mackey functors – restriction and induction of representations are an example. It’s what he calls a “(2-monoidal, Beck-Chevalley) Linear Homotopy-Type Theory”. Essentially, this is a list of conditions which ensure that, for the two morphisms in the span above, we have a “pull” operation and left and right adjoints to it (which need to be related in a nice way – the jargon here is that we must be in a Wirthmüller context), satisfying some nice relations, and that everything is functorial.

The intuition is that if we have some way of getting a “linear gadget” out of one of our configuration spaces of fields (analogous to constructing a space of functions when we do canonical quantization over, let’s say, a symplectic manifold), then we should be able to lift it (the “pull” operation) to the space of paths. Then the “push” part of the operation is where the “path integral” part comes in: many paths might contribute to the value of a function (or functor, or whatever it may be) at the end-point of those paths, because there are many ways to get from A to B, and all of them contribute in a linear way.
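A toy model may help fix the idea. It is an illustration under strong simplifying assumptions (finite sets instead of objects of \mathbb{H}, plain complex-valued functions as the “linear gadget”), not the construction from the talk. The maps s, t and the function S are names introduced here.

```latex
% Toy model: A, X, B finite sets, span A \xleftarrow{s} X \xrightarrow{t} B,
% and an "action" S : X \to \mathbb{R} on paths.
% Pull f : A \to \mathbb{C} back along s, weight by e^{iS}, push forward along t:
(Zf)(b) = \sum_{x \in t^{-1}(b)} e^{i S(x)} \, f\big(s(x)\big)

% Equivalently, Z : \mathbb{C}^A \to \mathbb{C}^B is the matrix
Z_{b,a} = \sum_{x \,:\, s(x) = a,\ t(x) = b} e^{i S(x)}
```

Each matrix entry is a sum over all paths from a to b, which is the finite analogue of a path integral with the twist entering as the phase e^{iS}.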

So, if this all seems rather abstract, that’s because the point of it is to characterize very generally what has to be available for the ideas that appear in physics notions of path-integral quantization to make sense. Many of the particulars – spectra, E_{\infty}-rings, infinity-stacks, and so on – which showed up in the example are in a sense just placeholders for anything with the right formal properties. So at the same time as it moves into seemingly very abstract terrain, this approach is also supposed to get out of the toy-model realm of TQFT, and really address the trouble in rigorously defining what’s meant by some of the standard practice of physics in field theory by analyzing the logical structure of what this practice is really saying. If it turns out to involve some unexpected math – well, given the underlying issues, it would have been more surprising if it didn’t.

It’s not clear to me how far along this road this program gets us, as far as dealing with questions an actual physicist would like to ask (for the most part, if the standard practice works as an algorithm to produce results, physicists seldom need to ask what it means in rigorous math language), but it does seem like an interesting question.

John Huerta visited here for about a week earlier this month, and gave a couple of talks. The one I want to write about here was a guest lecture in the topics course Susama Agarwala and I were teaching this past semester. The course was about topics in category theory of interest to geometry, and in the case of this lecture, “geometry” means supergeometry. It follows the approach I mentioned in the previous post about looking at sheaves as a kind of generalized space. The talk was an introduction to a program of seeing supermanifolds as a kind of sheaf on the site of “super-points”. This approach was first proposed by Albert Schwarz, though see, for instance, this review by Christophe Sachse for more about this approach, and this paper (comparing the situation for real and complex (super)manifolds) for more recent work.

It’s amazing how many geometrical techniques can be applied in quite general algebras once they’re formulated correctly. It’s perhaps less amazing for supermanifolds, in which commutativity fails in about the mildest possible way.  Essentially, the algebras in question split into bosonic and fermionic parts. Everything in the bosonic part commutes with everything, and the fermionic part commutes “up to a negative sign” within itself.

Supermanifolds

Supermanifolds are geometric objects, which were introduced as a setting on which “supersymmetric” quantum field theories could be defined. Whether or not “real” physics has this symmetry (the evidence is still pending), these are quite nicely behaved theories. (Throwing in extra symmetry assumptions tends to make things nicer, and supersymmetry is in some sense the maximum extra symmetry we might reasonably hope for in a QFT.)

Roughly, the idea is that supermanifolds are spaces like manifolds, but with some non-commuting coordinates. Supermanifolds are therefore in some sense “noncommutative spaces”. Noncommutative algebraic or differential geometry start with various dualities to the effect that some category of spaces is equivalent to the opposite of a corresponding category of algebras – for instance, a manifold M corresponds to the C^{\infty} algebra C^{\infty}(M,\mathbb{R}). So a generalized category of “spaces” can be found by dropping the “commutative” requirement from that statement. The category \mathbf{SMan} of supermanifolds only weakens the condition slightly: the algebras are \mathbb{Z}_2-graded, and are “supercommutative”, i.e. commute up to a sign which depends on the grading.

Now, the conventional definition of supermanifolds, as with schemes, is to say that they are spaces equipped with a “structure sheaf” which defines an appropriate class of functions. For ordinary (real) manifolds, this would be the sheaf assigning to an open set U the ring C^{\infty}(U,\mathbb{R}) of all the smooth real-valued functions. The existence of an atlas of charts for the manifold amounts to saying that the structure sheaf locally looks like C^{\infty}(V,\mathbb{R}) for some open set V \subset \mathbb{R}^p. (For fixed dimension p).

For supermanifolds, the condition on the local rings says that, for fixed dimension p|q, a p|q-dimensional supermanifold has a structure sheaf in which they look like

\mathcal{O}(\mathcal{U}) \cong C^{\infty}(V,\mathbb{R}) \otimes \Lambda_q

In this, V is as above, and the notation

\Lambda_q = \Lambda ( \theta_1, \dots , \theta_q )

refers to the exterior algebra, which we can think of as polynomials in the \theta_i, with the wedge product, which satisfies \theta_i \wedge \theta_j = - \theta_j \wedge \theta_i. The idea is that one is supposed to think of this as the algebra of smooth functions on a space with p ordinary dimensions, and q “anti-commuting” dimensions with coordinates \theta_i. The commuting variables, say x_1,\dots,x_p, are called “bosonic” or “even”, and the anticommuting ones are “fermionic” or “odd”. (The term “fermionic” is related to the fact that, in quantum mechanics, when building a Hilbert space for a bunch of identical fermions, one takes the antisymmetric part of the tensor product of their individual Hilbert spaces, so that, for instance, v_1 \otimes v_2 = - v_2 \otimes v_1).
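The smallest interesting case, q = 2, can be written out in full. The coefficient names f_0, f_1, f_2, f_{12} (and likewise for g) are labels chosen here, each standing for a smooth function on V, and the wedge symbol is suppressed.

```latex
% q = 2: basis of \Lambda_2 is \{1, \theta_1, \theta_2, \theta_1\theta_2\}, with
\theta_1^2 = \theta_2^2 = 0, \qquad \theta_2\theta_1 = -\theta_1\theta_2

% A general element of C^\infty(V) \otimes \Lambda_2:
f(x,\theta) = f_0(x) + f_1(x)\,\theta_1 + f_2(x)\,\theta_2 + f_{12}(x)\,\theta_1\theta_2

% Product of two such elements (all higher terms vanish):
f g = f_0 g_0 + (f_0 g_1 + f_1 g_0)\,\theta_1 + (f_0 g_2 + f_2 g_0)\,\theta_2
      + (f_0 g_{12} + f_{12} g_0 + f_1 g_2 - f_2 g_1)\,\theta_1\theta_2
```

The minus sign in the last coefficient comes from reordering \theta_2\theta_1 into \theta_1\theta_2.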

The structure sheaf picture can therefore be thought of as giving an atlas of charts, so that the neighborhoods locally look like “super-domains”, the super-geometry equivalent of open sets V \subset \mathbb{R}^p.

In fact, there’s a long-known theorem of Batchelor which says that any real supermanifold is given exactly by the algebra of “global sections”, which looks like \mathcal{O}(M) = C^{\infty}(M_{red},\mathbb{R}) \otimes \Lambda_q. That is, sections in the local rings (“functions on” open neighborhoods of M) always glue together to give a section in \mathcal{O}(M).

Another way to put this is that every supermanifold can be seen as just a bundle of exterior algebras. That is, a bundle over a base manifold M_{red}, whose fibres are the “super-points” \mathbb{R}^{0|q} corresponding to \Lambda_q. The base space M_{red} is called the “reduced” manifold. Any such bundle gives back a supermanifold, where the algebras in the structure sheaf are the algebras of sections of the bundle.

One shouldn’t be too complacent about saying they are exactly the same, though: this correspondence isn’t functorial. That is, the maps between supermanifolds are not just bundle maps. (Also, Batchelor’s theorem works only for real, not for complex, supermanifolds, where only the local neighborhoods necessarily look like such bundles).

Why, by the way, say that \mathbb{R}^{0|q} is a super “point”, when \mathbb{R}^{p|0} is a whole vector space? Since the fermionic variables are anticommuting, no term can have more than one of each \theta_i, so this is a finite-dimensional algebra. This is unlike C^{\infty}(V,\mathbb{R}), which suggests that the noncommutative directions are quite different. Any element of \Lambda_q with no constant term is nilpotent, so if we think of a Taylor series for some function – a power series in the (x_1,\dots,x_p,\theta_1,\dots,\theta_q) – we note that no term has a power of \theta_i greater than 1, or degree higher than q in all the \theta_i together – so one imagines that only infinitesimal behaviour in these directions exists at all. Thus, a supermanifold M is like an ordinary p-dimensional manifold M_{red}, built from the ordinary domains V, equipped with a bundle whose fibres are a sort of “infinitesimal fuzz” about each point of the “even part” of the supermanifold, described by the \Lambda_q.
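The finiteness and nilpotency claims can be made quantitative. In this sketch the multi-index notation \theta_I and the element \nu are introduced for illustration only.

```latex
% Monomials \theta_I = \theta_{i_1}\cdots\theta_{i_k}, indexed by subsets
% I = \{i_1 < \dots < i_k\} \subseteq \{1,\dots,q\}, form a basis, so
\dim_{\mathbb{R}} \Lambda_q = \sum_{k=0}^{q} \binom{q}{k} = 2^q

% "Taylor expansion" in the odd directions is a finite sum:
f(x,\theta) = \sum_{I \subseteq \{1,\dots,q\}} f_I(x)\,\theta_I

% If \nu \in \Lambda_q has no constant term, then
\nu^{\,q+1} = 0,
\qquad (1+\nu)^{-1} = 1 - \nu + \nu^2 - \dots + (-1)^q \nu^q
```

The vanishing of \nu^{q+1} holds because every monomial in it would need at least q+1 factors of the \theta_i, and any such product repeats a generator.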

But this intuition is a bit vague. We can sharpen it a bit using the functor of points approach…

Supermanifolds as Manifold-Valued Sheaves

As with schemes, there is also a point of view that sees supermanifolds as “ordinary” manifolds, constructed in the topos of sheaves over a certain site. The basic insight behind the picture of these spaces, as in the previous post, is based on the fact that the Yoneda lemma lets us think of sheaves as describing all the “probes” of a generalized space (actually an algebra in this case). The “probes” are the objects of a certain category, and are called “superpoints”.

This category is just \mathbf{Spt} = \mathbf{Gr}^{op}, the opposite of the category of Grassmann algebras (i.e. exterior algebras) – that is, polynomial algebras in anticommuting variables, like \Lambda(\theta_1,\dots,\theta_q). These objects naturally come with a \mathbb{Z}_2-grading \Lambda_q = (\Lambda_q)_0 \oplus (\Lambda_q)_1, whose two parts are spanned, respectively, by the monomials with even and odd degree:

(\Lambda_q)_0 = span( 1, \theta_i \theta_j, \theta_{i_1}\dots\theta_{i_4}, \dots )

and

(\Lambda_q)_1 = span( \theta_i, \theta_i \theta_j \theta_k, \theta_{i_1}\dots\theta_{i_5},\dots )

This is a \mathbb{Z}_2-grading since the even ones commute with anything, and the odd ones anti-commute with each other. So if f_i and f_j are homogeneous (live entirely in one grade or the other), then f_i f_j = (-1)^{deg(f_i)deg(f_j)} f_j f_i.
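Here is the sign rule checked on a few monomials, with degrees taken mod 2 (0 for even, 1 for odd). The particular monomials are just convenient test cases.

```latex
% odd \cdot odd: sign (-1)^{1 \cdot 1} = -1
\theta_1 \theta_2 = -\,\theta_2 \theta_1

% odd \cdot even: sign (-1)^{1 \cdot 0} = +1
\theta_1 (\theta_2\theta_3) = -\,\theta_2\theta_1\theta_3 = (\theta_2\theta_3)\,\theta_1

% even \cdot even: sign (-1)^{0 \cdot 0} = +1
(\theta_1\theta_2)(\theta_3\theta_4) = (\theta_3\theta_4)(\theta_1\theta_2)
```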

The \Lambda_q should be thought of as the (0|q)-dimensional supermanifold: it looks like a point, with a q-dimensional fermionic tangent space (the “infinitesimal fuzz” noted above) attached. The morphisms in \mathbf{Spt} from \Lambda_q to \Lambda_r are just the grade-preserving algebra homomorphisms from \Lambda_r to \Lambda_q. There are quite a few of these: these objects are not terminal objects like the actual point. But this makes them good probes. This gets to be a site with the trivial topology, so that all presheaves are sheaves.

Then, as usual, a presheaf M on this category is to be understood as giving, for each object A=\Lambda_q, the collection of maps from \Lambda_q to a space M. The case q=0 gives the set of points of M, and the various other algebras A give sets of “A-points”. This term is based on the analogy that a point of a topological space (or indeed element of a set) is just the same as a map from the terminal object 1, the one point space (or one element set). Then an “A-point” of a space X is just a map from another object A. If A is not terminal, this is close to the notion of a “subspace” (though a subspace, strictly, would be a monomorphism from A). These are maps from A in \mathbf{Spt} = \mathbf{Gr}^{op}, or as algebra maps, M_A consists of all the maps \mathcal{O}(M) \rightarrow A.

What’s more, since this is a functor, we have to have a system of maps between the M_A. For any algebra map A \rightarrow A', we should get a corresponding map M_A \rightarrow M_{A'}, by composing \mathcal{O}(M) \rightarrow A with it. These are really algebra maps \Lambda_q \rightarrow \Lambda_{q'}, of which there are plenty, all determined by the images of the generators \theta_1, \dots, \theta_q.
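To see how many such maps there are, one can spell out what “determined by the images of the generators” means. In this sketch the names \varphi and \eta_i are introduced here, and the generators of the target algebra are written with primes to keep them apart from those of the source.

```latex
% A grade-preserving algebra map \varphi : \Lambda_q \to \Lambda_{q'} is fixed by
% a choice of odd elements \eta_1, \dots, \eta_q \in (\Lambda_{q'})_1:
\varphi(\theta_i) = \eta_i,
\qquad \varphi(\theta_{i_1}\cdots\theta_{i_k}) = \eta_{i_1}\cdots\eta_{i_k}

% Any choice works, because odd elements automatically satisfy
\eta_i \eta_j = -\,\eta_j \eta_i, \qquad \eta_i^2 = 0

% Example, q = 1, q' = 3: the maps \Lambda_1 \to \Lambda_3 are
\theta \mapsto a\,\theta'_1 + b\,\theta'_2 + c\,\theta'_3 + d\,\theta'_1\theta'_2\theta'_3,
\qquad (a,b,c,d) \in \mathbb{R}^4
```

So even for a single odd generator there is a four-parameter family of maps into \Lambda_3, which is why these probes see more than the ordinary point does.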

Now, really, a sheaf on \mathbf{Spt} is actually just what we might call a “super-set”, with sets M_A for each A \in \mathbf{Spt}. To make super-manifolds, one wants to say they are “manifold-valued sheaves”. Since manifolds themselves don’t form a topos, one needs to be a bit careful about defining the extra structure which makes a set a manifold.

Thus, a supermanifold M is a manifold constructed in the topos Sh(\mathbf{Spt}). That is, M must also be equipped with a topology and a collection of charts defining the manifold structure. These are all construed internally using objects and morphisms in the category of sheaves, where charts are based on super-domains, namely those algebras which look like C^{\infty}(V) \otimes \Lambda_q, for V an open subset of \mathbb{R}^p.

The reduced manifold M_{red} which appears in Batchelor’s theorem is the manifold of ordinary points M_{\mathbb{R}}. That is, it is all the \mathbb{R}-points, where \mathbb{R} is playing the role of functions on the zero-dimensional domain with just one point. All the extra structure in an atlas of charts for all of M to make it a supermanifold amounts to putting the structure of ordinary manifolds on the M_A – but in compatible ways.

(Alternatively, we could have described \mathbf{SMan} as sheaves in Sh(\mathbf{SDom}), where \mathbf{SDom} is a site of “superdomains”, and put all the structure defining a manifold into \mathbf{SDom}. But working over super-points is preferable for the moment, since it makes it clear that manifolds and supermanifolds are just manifestations of the same basic definition, but realized in two different toposes.)

The fact that the manifold structure on the M_A must be put on them compatibly means there is a relatively nice way to picture all these spaces.

Values of the Functor of Points as Bundles

The main idea which I find helps to understand the functor of points is that, for every superpoint \mathbb{R}^{0|n} (i.e. for every Grassmann algebra A=\Lambda_n), one gets a manifold M_A. (Note the convention that q is the odd dimension of M, and n is the odd dimension of the probe superpoint.)

Just as every supermanifold is a bundle of superpoints, every manifold M_A is a perfectly conventional vector bundle over the conventional manifold M_{red} of ordinary points. So for each A, we get a bundle, M_A \rightarrow M_{red}.

Now this manifold, M_{red}, consists exactly of all the “points” of M – this tells us immediately that \mathbf{SMan} is not a category of concrete sheaves (in the sense I explained in the previous post). Put another way, it’s not a concrete category – that would mean that there is an underlying set functor, which gives a set for each object, and that morphisms are determined by what they do to underlying sets. Non-concrete categories are, by nature, trickier to understand.

However, the functor of points gives a way to turn the non-concrete M into a tower of concrete manifolds M_A, and the morphisms between various M amount to compatible towers of maps between the various M_A for each A. The fact that the compatibility is controlled by algebra maps \Lambda_q \rightarrow \Lambda_{q'} explains why this is the same as maps between these bundles of superpoints.

Specifically, then, we have

M_A = \{ \mathcal{O}(M) \rightarrow A \}

This splits into maps of the even parts, and of the odd parts, where the Grassmann algebra A = \Lambda_n has even and odd parts: A = A_0 \oplus A_1, as above. Similarly, \mathcal{O}(M) splits into odd and even parts, and since the functions on M_{red} are entirely even, this is:

( \mathcal{O}(M))_0 = C^{\infty}(M_{red}) \otimes ( \Lambda_q)_0

and

( \mathcal{O}(M))_1 = C^{\infty}(M_{red}) \otimes (\Lambda_q)_1

Now, the duality of “hom” and tensor means that Hom(\mathcal{O}(M),A) \cong \mathcal{O}(M) \otimes A, and algebra maps preserve the grading. So we just have tensor products of these with the even and odd parts, respectively, of the probe superpoint. Since the even part A_0 includes the multiples of the constants, part of this just gives a copy of U itself. The remaining part of A_0 is nilpotent (since it’s made of even-degree polynomials in the nilpotent \theta_i), so what we end up with, looking at the bundle over an open neighborhood U \subset M_{red}, is:

U_A = U \times ( (\Lambda_q)_0 \otimes A^{nil}_0) \times ((\Lambda_q)_1 \otimes A_1)

The projection map U_A \rightarrow U is the obvious projection onto the first factor. These assemble into a bundle over M_{red}.

We should think of these bundles as “shifting up” the nilpotent part of M (which is invisible at the level of ordinary points in M_{red}) by the algebra A. Writing them this way makes it clear that this is functorial in the superpoints A = \Lambda_n: given choices n and n', and any morphism between the corresponding A and A', it’s easy to see how we get maps between these bundles.

Now, maps between supermanifolds are the same thing as natural transformations between the functors of points. These include maps of the base manifolds, along with maps between the total spaces of all these bundles. More, this tower of maps must commute with all those bundle maps coming from algebra maps A \rightarrow A'. (In particular, since A = \Lambda_0, the ordinary point, is one of these, they have to commute with the projection to M_{red}.) These conditions may be quite restrictive, but they leave us with, at least, a quite concrete image of what maps of supermanifolds are.

Super-Poincaré Group

One of the main settings where super-geometry appears is in so-called “supersymmetric” field theories, which is a concept that makes sense when fields live on supermanifolds. Supersymmetry, and symmetries associated to super-Lie groups, is exactly the kind of thing that John has worked on. A super-Lie group, of course, is a supermanifold that has the structure of a group (i.e. it’s a Lie group in the topos of presheaves over the site of super-points – so the discussion above means it can be thought of as a big tower of Lie groups, all bundles over a Lie group G_{red}).

In fact, John has mostly worked with super-Lie algebras (and the connection between these and division algebras, though that’s another story). These are \mathbb{Z}_2-graded algebras with a Lie bracket whose commutation properties are the graded version of those for an ordinary Lie algebra. But part of the value of the framework above is that we can simply borrow results from Lie theory for manifolds, import them into the new topos PSh(\mathbf{Spt}), and know at once that super-Lie algebras integrate up to super-Lie groups in just the same way that happens in the old topos (of sets).
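For reference, the “graded version” of the Lie algebra axioms can be written out as follows. This is the standard sign convention, stated for homogeneous elements x, y, z of degrees |x|, |y|, |z| \in \mathbb{Z}_2; it is not taken from the lecture itself.

```latex
% Graded antisymmetry:
[x, y] = -(-1)^{|x||y|}\,[y, x]
% Graded Jacobi identity (the bracket with x acts as a graded derivation):
[x, [y, z]] = [[x, y], z] + (-1)^{|x||y|}\,[y, [x, z]]
```

When all elements are even, the signs disappear and these reduce to the usual antisymmetry and Jacobi identity. For two odd elements, the first identity says the bracket is symmetric.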

Supersymmetry refers to a particular example, namely the “super-Poincaré group”. Just as the Poincaré group is the symmetry group of Minkowski space, a 4-manifold with a certain metric on it, the super-Poincaré group has the same relation to a certain supermanifold. (There are actually a few different versions, depending on the odd dimension.) The algebra is generated by infinitesimal translations, rotations and boosts, plus some “translations” in fermionic directions, which generate the odd part of the algebra.

Now, symmetry in a quantum theory means that this algebra (or, on integration, the corresponding group) acts on the Hilbert space \mathcal{H} of possible states of the theory: that is, the space of states is actually a representation of this algebra. In fact, to make sense of this, we need a super-Hilbert space (i.e. a graded one). The even generators of the algebra then produce grade-preserving self-maps of \mathcal{H}, and the odd generators produce grade-reversing ones. (The fact that there are symmetries which flip the “bosonic” and “fermionic” parts of the total \mathcal{H} is why supersymmetric theories have “superpartners” for each particle, with the opposite parity, since particles are labelled by irreducible representations of the Poincaré group and the gauge group.)
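A toy illustration of grade-preserving versus grade-reversing maps, assuming a two-dimensional graded space with one even and one odd basis vector. The matrices below are made up for the example and are not generators of any actual supersymmetry algebra. The grading is recorded by a parity operator P, and an operator is even or odd according to whether it commutes or anticommutes with P.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def negate(X):
    """Entrywise negation of a matrix."""
    return [[-entry for entry in row] for row in X]

# Parity operator: +1 on the even ("bosonic") part, -1 on the odd ("fermionic") part.
P = [[1, 0], [0, -1]]

even_op = [[2, 0], [0, 3]]   # block-diagonal: preserves the grading
odd_op = [[0, 1], [1, 0]]    # off-diagonal: swaps the two parts

even_commutes = matmul(P, even_op) == matmul(even_op, P)
odd_anticommutes = matmul(P, odd_op) == negate(matmul(odd_op, P))
```

Both flags come out true: the block-diagonal operator commutes with P, while the off-diagonal one anticommutes with it.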

To date, so far as I know, there’s no conclusive empirical evidence that real quantum field theories actually exhibit supersymmetry, such as detecting actual super-partners for known particles. Even if not, however, it still has some use as a way of developing toy models of quite complicated theories which are more tractable than one might expect, precisely because they have lots of symmetry. It’s somewhat like how it’s much easier to study computationally difficult theories like gravity by assuming, for instance, spherical symmetry as an extra assumption. In any case, from a mathematician’s point of view, this sort of symmetry is just a particularly simple case of symmetries for theories which live on noncommutative backgrounds, which is quite an interesting topic in its own right. As usual, physics generates lots of math which remains both true and interesting whether or not it applies in the way it was originally suggested.

In any case, what the functor-of-points viewpoint suggests is that ordinary and super- symmetries are just two special cases of “symmetries of a field theory” in two different toposes. Understanding these and other examples from this point of view seems to give a different understanding of what “symmetry”, one of the most fundamental yet slippery concepts in mathematics and science, actually means.

This semester, Susama Agarwala and I have been sharing a lecture series for graduate students. (A caveat: there are lecture notes there, by student request, but they’re rough notes, which contain some mistakes and omissions, and represent a very selective view of the subject.) Being a “topics” course, it consists of a few different sections, loosely related, which revolve around the theme of categorical tools which are useful for geometry (and topology).

What this has amounted to is: I gave a half-semester worth of courses on toposes, sheaves, and the basics of derived categories. Susama is now giving the second half, which is about motives. This post will talk about the part of the course I gave. Though this was a whole series of lectures which introduced all these topics more or less carefully, I want to focus here on the part of the lecture which built up to a discussion of sheaves as spaces. Nothing here, or in the two posts to follow, is particularly new, but they do amount to a nice set of snapshots of some related ideas.

Coming up soon: John Huerta is currently visiting Hamburg, and on July 8, he gave a guest lecture which uses some of this machinery to talk about supermanifolds, which will be the subject of the next post in this series. In a later post, I’ll talk about Susama’s lectures about motives and how this relates to the discussion here (loosely).

Grothendieck Toposes

The first half of our course was about various aspects of Grothendieck toposes. In the first lecture, I talked about “Elementary” (or Lawvere-Tierney) toposes. One way to look at these is to say that they are categories \mathcal{E} which have all the properties of the category of Sets which make it useful for doing most of ordinary mathematics. Thus, a topos in this sense is a category with a bunch of properties – there are various equivalent definitions, but for example, toposes have all finite limits (in particular, products), and all finite colimits.

More particularly, they have “exponential objects”. That is, if A and B are objects of \mathcal{E}, then there is an object B^A, with an “evaluation map” B^A \times A \rightarrow B, which makes it possible to think of B^A as the object of “morphisms from A to B”.

The other main thing a topos has is a “subobject classifier”. Now, a subobject of A \in \mathcal{E} is an equivalence class of monomorphisms into A – think of sets, where this amounts to specifying the image, and the monomorphisms are the various inclusions which pick out the same subset as their image. A classifier for subobjects should be thought of as something like the two-element set in Sets, whose elements we can call “true” and “false”. Then every subset of A corresponds to a characteristic function A \rightarrow \mathbf{2}. In general, a subobject classifier is an object \Omega together with a map from the terminal object, T : 1 \rightarrow \Omega, such that every inclusion of a subobject is a pullback of T along a unique characteristic function.
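In Sets this correspondence is easy to spell out. The following Python sketch assumes finite sets and uses the hypothetical helper names `characteristic` and `pullback_of_true`. It sends a subset to its characteristic function and recovers the subset as the preimage of “true”.

```python
from itertools import product

def characteristic(subset, A):
    """The characteristic function A -> {True, False} of a subset of A."""
    return {a: (a in subset) for a in A}

def pullback_of_true(chi):
    """The subset classified by chi: the preimage of True."""
    return {a for a, value in chi.items() if value}

A = [1, 2, 3, 4]
S = {2, 4}
chi = characteristic(S, A)
recovered = pullback_of_true(chi)

# Every function A -> 2 classifies exactly one subset, so there are
# 2^|A| subsets in total.
all_subsets = {
    frozenset(pullback_of_true(dict(zip(A, values))))
    for values in product([True, False], repeat=len(A))
}
```

Here `recovered` equals `S`, and `all_subsets` has 16 elements, one for each function from the four-element set A to the two-element set.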

Now, elementary toposes were invented chronologically later than Grothendieck toposes, which are a special class of example. These are categories of sheaves on (Grothendieck) sites. A site is a category \mathcal{T} together with a “topology” J, which is a rule which, for each U \in \mathcal{T}, picks out J(U), a set of collections of maps into U, called sieves for U. The collections J(U) have to satisfy certain conditions, but the idea can be understood in terms of the basic example, \mathcal{T} = TOP(X). Given a topological space X, TOP(X) is the category whose objects are the open sets U \subset X, and the morphisms are all the inclusions. Then the condition is that each collection in J(U) is an open cover of U – that is, a bunch of inclusions of open sets, which together cover all of U in the usual sense.

(This is a little special to TOP(X), where every map is an inclusion – in a general site, the J(U) need to be closed under composition with any other morphism (like an ideal in a ring). So for instance, for \mathcal{T} = Top, the category of topological spaces, the usual choice of J(U) consists of all collections of maps which are jointly surjective.)

The point is that a presheaf on \mathcal{T} is just a functor \mathcal{T}^{op} \rightarrow Sets. That is, it’s a way of assigning a set to each U \in \mathcal{T}. So, for instance, for either of the cases we just mentioned, one has B : \mathcal{T}^{op} \rightarrow Sets, which assigns to each open set U the set of all bounded functions on U, and to every inclusion the restriction map. Or, again, one has C : \mathcal{T}^{op} \rightarrow Sets, which assigns the set of all continuous functions.

These two examples illustrate the condition which distinguishes those presheaves S which are sheaves – namely, those which satisfy some “gluing” conditions. (Continuous functions satisfy them; bounded functions do not, since functions which are bounded on each set of an infinite cover need not be bounded on the union.) Thus, suppose we’re given an open cover \{ f_i : U_i \rightarrow U \}, and a choice of one element x_i from each S(U_i), which form a “matching family” in the sense that they agree when restricted to any overlaps. Then the sheaf condition says that there’s a unique “amalgamation” of this family – that is, one element x \in S(U) which restricts to all the x_i under the maps S(f_i) : S(U) \rightarrow S(U_i).
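The gluing condition can be tried out on a finite toy model. The sketch below assumes a discrete four-point space, with sections stored as dictionaries from points to values; the helper names are hypothetical. Since bounded functions only fail to glue over infinite covers, the sketch swaps in constant functions as a finite stand-in for a presheaf which is not a sheaf.

```python
def restrict(section, V):
    """Restrict a section (a dict point -> value) to the subset V."""
    return {p: section[p] for p in V}

def is_matching(cover, family):
    """Check that the sections agree on every pairwise overlap."""
    for U_i, s_i in zip(cover, family):
        for U_j, s_j in zip(cover, family):
            overlap = U_i & U_j
            if restrict(s_i, overlap) != restrict(s_j, overlap):
                return False
    return True

def glue(family):
    """Amalgamate a matching family in the presheaf of all functions."""
    glued = {}
    for s in family:
        glued.update(s)
    return glued

def is_constant(section):
    """Does the section lie in the presheaf of constant functions?"""
    return len(set(section.values())) <= 1

# Two disjoint "open sets" covering {0, 1, 2, 3}, with a constant
# function on each.
cover = [frozenset({0, 1}), frozenset({2, 3})]
family = [{0: "a", 1: "a"}, {2: "b", 3: "b"}]

matching = is_matching(cover, family)
amalgamation = glue(family)
restricts_back = all(restrict(amalgamation, U) == s
                     for U, s in zip(cover, family))
```

The family is matching and glues to a unique function on the whole space, so the presheaf of all functions passes this instance of the sheaf condition. The glued function is not constant, though, so the same matching family has no amalgamation in the presheaf of constant functions.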

Sheaves as Generalized Spaces

There are various ways of looking at sheaves, but for the purposes of the course on categorical methods in geometry, I decided to emphasize the point of view that they are a sort of generalized spaces.

The intuition here is that all the objects and morphisms in a site \mathcal{T} have corresponding objects and morphisms in Psh(\mathcal{T}). Namely, the objects appear as the representable presheaves, U \mapsto Hom(-,U), and the morphisms U \rightarrow V show up as the induced natural transformations between these functors. This map y : \mathcal{T} \rightarrow Psh(\mathcal{T}) is called the Yoneda embedding. If \mathcal{T} is at all well-behaved (as it is in all the examples we’re interested in here), these presheaves will always be sheaves: the image of y lands in Sh(\mathcal{T}).

In this case, the Yoneda embedding embeds \mathcal{T} as a sub-category of Sh(\mathcal{T}). What’s more, it’s a full subcategory: all the natural transformations between representable presheaves come from the morphisms of \mathcal{T}-objects in a unique way. So Sh(\mathcal{T}) is, in this sense, a generalization of \mathcal{T} itself.

More precisely, it’s the Yoneda lemma which makes sense of all this. The idea is to start with the way ordinary \mathcal{T}-objects (from now on, just call them “spaces”) S become presheaves: they become functors which assign to each U the set of all maps into S. So the idea is to turn this around, and declare that even non-representable sheaves should have the same interpretation. The Yoneda Lemma makes this a sensible interpretation: it says that, for any presheaf F \in Psh(\mathcal{T}), and any U \in \mathcal{T}, the set F(U) is naturally isomorphic to Hom(y(U),F): that is, F(U) literally is the collection of morphisms from U (or rather, its image under the Yoneda embedding) to a “generalized space” F. (See also Tom Leinster’s nice discussion of the Yoneda Lemma if this isn’t familiar.) We describe U as a “probe” object: one probes the space F by mapping U into it in various ways. Knowing the results for all U \in \mathcal{T} tells you all about the “space” F. (Thus, for instance, one can get all the information about the homotopy type of a space if you know all the maps into it from spheres of all dimensions up to homotopy. So spheres are acting as “probes” to reveal things about the space.)
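Written out, the isomorphism in the Yoneda Lemma and its inverse are as follows. This is the standard statement, in the notation used above.

```latex
\mathrm{Hom}_{Psh(\mathcal{T})}\bigl(y(U), F\bigr) \;\cong\; F(U),
\qquad \alpha \;\mapsto\; \alpha_U(\mathrm{id}_U)
% Inverse: an element x \in F(U) gives the natural transformation whose
% component at V sends a map f : V \to U to the restriction of x along f.
x \;\mapsto\; \bigl( f : V \rightarrow U \bigr) \mapsto F(f)(x)
```

So a “U-shaped probe” of F is determined by a single element of F(U), namely where it sends the identity map of U.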

Furthermore, since Sh(\mathcal{T}) is a topos, it is often a nicer category than the one you start with. It has limits and colimits, for instance, which the original category might not have. For example, if the kind of spaces you want to generalize are manifolds, one doesn’t have colimits, such as the space you get by gluing together two lines at a point. The sheaf category does. Likewise, the sheaf category has exponentials, and manifolds don’t (at least not without the more involved definitions needed to allow infinite-dimensional manifolds).

These last remarks about manifolds suggest the motivation for the first example…

Diffeological Spaces

The lecture I gave about sheaves as spaces used this paper by John Baez and Alex Hoffnung about “smooth spaces” (they treat Souriau’s diffeological spaces, and the different but related Chen spaces in the same framework) to illustrate the point. In that case, the objects of the sites are open (or, for Chen spaces, convex) subsets of \mathbb{R}^n, for all choices of n, the maps are the smooth maps in the usual sense (i.e. the sense to be generalized), and the covers are jointly surjective collections of maps.

Now, that example is a somewhat special situation: they talk about concrete sheaves, on concrete sites, and the resulting categories are only quasitoposes – a slightly weaker condition than being a topos, but one still gets a useful collection of spaces, which among other things include all manifolds. The “concreteness” condition on the site requires, in particular, that \mathcal{T} has a terminal object to play the role of “the point”. Being a concrete sheaf then means that all the “generalized spaces” have an underlying set of points (namely, the set of maps from the point object), and that all morphisms between the spaces are completely determined by what they do to the underlying set of points. This means that the “spaces” really are just sets with some structure.

Now, if the site happens to be TOP(X), then we have a slightly different intuition: the “generalized” spaces are something like generalized bundles over X, and the “probes” are now sections of such a bundle. A simple example would be an actual sheaf of functions: these are sections of a trivial bundle, since, say, \mathbb{C}-valued functions are sections of the bundle \pi: X \times \mathbb{C} \rightarrow X. Given a nontrivial bundle \pi : M \rightarrow X, there is a sheaf of sections – on each U, one gets F_M(U) to be all the sections s : U \rightarrow M, i.e. the maps which are one-sided inverses of \pi over U. For a generic sheaf, we can imagine a sort of “generalized bundle” over X.

Schemes

Another example of the fact that sheaves can be seen as spaces is the category of schemes: these are often described as topological spaces which are themselves equipped with a sheaf of rings. “Scheme” is to algebraic geometry what “manifold” is to differential geometry: a kind of space which looks locally like something classical and familiar. Schemes, in some neighborhood of each point, must resemble varieties – i.e. the locus of zeroes of some algebraic function on \mathbb{k}^n. For varieties, the rings attached to neighborhoods are rings of algebraic functions on this locus, which will be a quotient of the ring of polynomials.

But another way to think of schemes is as concrete sheaves on a site whose objects are varieties and whose morphisms are algebraic maps. This is dual to the other point of view, just as thinking of diffeological spaces as sheaves is dual to a viewpoint in which they’re seen as topological spaces equipped with a notion of “smooth function”.

(Some general discussion of this in a talk by Victor Piercey)

Generalities

These two viewpoints (defining the structure of a space by a class of maps into it, or by a class of maps out of it) in principle give different definitions. To move between them, you really need everything to be concrete: the space has an underlying set, the set of probes is a collection of real set-functions. Likewise, for something like a scheme, you’d need the ring for any open set to be a ring of actual set-functions. In this case, one can move between the two descriptions of the space as long as there is a pre-existing concept of the right kind of function  on the “probe” spaces. Given a smooth space, say, one can define a sheaf of smooth functions on each open set by taking those whose composites with every probe are smooth. Conversely, given something like a scheme, where the structure sheaf is of function rings on each open subspace (i.e. the sheaf is representable), one can define the probes from varieties to be those which give algebraic functions when composed with every function in these rings. Neither of these will work in general: the two approaches define different categories of spaces (in the smooth context, see Andrew Stacey’s comparison of various categories of smooth spaces, defined either by specifying the smooth maps in, or out, or both). But for very concrete situations, they fit together neatly.

The concrete case is therefore nice for getting an intuition for what it means to think of sheaves as spaces. For sheaves which aren’t concrete, morphisms aren’t determined by what they do to the underlying points, i.e. the forgetful “underlying set” functor isn’t faithful. Here, we might think of a “generalized space” which looks like two copies of the same topological space: the sheaf gives two different elements of F(U) for each map of underlying sets. We could think of such a generalized space as built from sets equipped with extra “stuff” (say, a set consisting of pairs (x,i) \in X \times \{ blue , green \} – so it consists of a “blue” copy of X and a “green” copy of X, but the underlying set functor ignores the colouring).

Still, useful as they may be to get a first handle on this concept of sheaf as generalized space, one shouldn’t rely on these intuitions too much: if \mathcal{T} doesn’t even have a “point” object, there is no underlying set functor at all. Eventually, one simply has to get used to the idea of defining a space by the information revealed by probes.

In the next post, I’ll talk more about this in the context of John Huerta’s guest lecture, applying this idea to the category of supermanifolds, which can be seen as manifolds built internal to the topos of (pre)sheaves on a site whose objects are called “super-points”.


As August is the month in which Portugal goes on vacation, and we had several family visitors toward the end of the summer, I haven’t posted in a while, but the term has now started up at IST, and seminars are underway, so there should be some interesting stuff coming up to talk about.

New Blog

First, I’ll point out that Derek Wise has started a new blog, called simply “Simplicity”, which is (I imagine) what it aims to contain: things which seem complex explained so as to reveal their simplicity.  Unless I’m reading too much into the title.  As of this writing, he’s posted only one entry, but a lengthy one that gives a nice explanation of a program for categorified Klein geometries which he’s been thinking a bunch about.  Klein’s program for describing the geometry of homogeneous spaces (such as spherical, Euclidean, and hyperbolic spaces with constant curvature, for example) was developed at Erlangen, and goes by the name “The Erlangen Program”.  Since Derek is now doing a postdoc at Erlangen, and this is supposed to be a categorification of Klein’s approach, he’s referred to it as the “2-Erlangen Program”.  There’s more discussion about it in a (somewhat) recent post by John Baez at the n-Category Cafe.  Both of them note the recent draft paper they did relating a higher gauge theory based on the Poincare 2-group to a theory known as teleparallel gravity.  I don’t know this theory so well, except that it’s some almost-equivalent way of formulating General Relativity.

I’ll refer you to Derek’s own post for full details of what’s going on in this approach, but the basic motivation isn’t too hard to set out.  The Erlangen program takes the view that a homogeneous space is a space X (let’s say we mean by this a topological space) which “looks the same everywhere”.  More precisely, there’s a group action by some G, which we understand to be “symmetries” of the space, which is transitive.  Since every point is taken to every other point by some symmetry, the space is “homogeneous”.  Some symmetries leave certain points x \in X where they are – they form the stabilizer subgroup H = Stab(x).  When the space is homogeneous, it is isomorphic to the coset space, X \cong G / H.  So Klein’s idea is to say that any time you have a Lie group G and a closed subgroup H < G, this quotient will be called a “homogeneous space”.  A familiar example would be Euclidean space, \mathbb{R}^n \cong E(n) / O(n), where E is the Euclidean group and O is the orthogonal group, but there are plenty of others.
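The isomorphism X \cong G / H can be checked by hand in a small finite case. The sketch below is a discrete stand-in for the Lie group setting, chosen only for illustration: it takes G to be the symmetric group on three points acting on X = \{0, 1, 2\}, and verifies that cosets of the stabilizer of a point correspond bijectively to points of X.

```python
from itertools import permutations

points = (0, 1, 2)
# Each group element g is a tuple, with g[p] the image of the point p.
G = list(permutations(points))

def compose(g, h):
    """The composite g after h, i.e. p -> g[h[p]]."""
    return tuple(g[h[p]] for p in points)

x = 0
H = [g for g in G if g[x] == x]      # stabilizer subgroup Stab(x)
orbit = {g[x] for g in G}            # all of X, since the action is transitive

# Left cosets gH, each stored as a frozenset of group elements.
cosets = {frozenset(compose(g, h) for h in H) for g in G}

# Every element of a coset gH sends x to the same point g[x], so the
# map gH -> g[x] is well defined.
coset_to_point = {c: {g[x] for g in c} for c in cosets}
image = {next(iter(pts)) for pts in coset_to_point.values()}
```

Each coset maps to a single point, and the three cosets hit the three points of X exactly once each, which is the finite version of X \cong G / H.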

This example indicates what Cartan geometry is all about, though – this is the next natural step after Klein geometry (Edit:  Derek’s blog now has a visual explanation of Cartan geometry, a.k.a. “generalized hamsterology”, new since I originally posted this).  We can say that Cartan is to Klein as Riemann is to Euclid.  (Or that Cartan is to Riemann as Klein is to Euclid – or if you want to get maybe too-precisely metaphorical, Cartan is the pushout of Klein and Riemann over Euclid).  The point is that Riemannian geometry studies manifolds – spaces which are not homogeneous, but look like Euclidean space locally.  Cartan geometry studies spaces which aren’t homogeneous, but can be locally modelled by Klein geometries.  Now, a Riemannian geometry is essentially a manifold with a metric, describing how it locally looks like Euclidean space.  An equivalent way to talk about it is a manifold with a bundle of Euclidean spaces (the tangent spaces) with a connection (the Levi-Civita connection associated to the metric).  A Cartan geometry can likewise be described as a G-bundle with fibre X with a connection.

Then the point of the “2-Erlangen program” is to develop similar geometric machinery for 2-groups (a.k.a. categorical groups).  This is, as usual, a bit more complicated since actions of 2-groups are trickier than group-actions.  In their paper, though, the point is to look at spaces which are locally modelled by some sort of 2-Klein geometry which derives from the Poincare 2-group.  By analogy with Cartan geometry, one can talk about such Poincare 2-group connections on a space – that is, some kind of “higher gauge theory”.  This is the sort of framework where John and Derek’s draft paper formulates teleparallel gravity.  It turns out that the 2-group connection ends up looking like a regular connection with torsion, and this plays a role in that theory.  Their draft will give you a lot more detail.

Talk on Manifold Calculus

On a different note, one of the first talks I went to this semester was one by Pedro Brito about “Manifold Calculus and Operads” (though he ran out of time in the seminar before getting to talk about the connection to operads).  This was about motivating and introducing the Goodwillie Calculus for functors between categories of spaces.  (There are various references on this, but see for instance these notes by Hal Sadofsky). In some sense this is a generalization of calculus from functions to functors, and one of the main results Goodwillie introduced with this subject is a functorial analog of Taylor’s theorem.  I’d seen some of this before, but this talk was a nice and accessible intro to the topic.

So the starting point for this “Manifold Calculus” is that we’d like to study functors from spaces to spaces (in fact this all applies to spectra, which are more general, but Pedro Brito’s talk was focused on spaces).  The sort of thing we’re talking about is a functor which, given a space M, gives a moduli space of some sort of geometric structures we can put on M, or of mappings from M.  The main motivating example he gave was the functor

Imm(-,N) : [Spaces] \rightarrow [Spaces]

for some fixed manifold N. Given a manifold M, this gives the mapping space of all immersions of M into N.

(Recalling some terminology: immersions are maps of manifolds where the differential is nondegenerate – the induced map of tangent spaces is everywhere injective, meaning essentially that there are no points, cusps, or kinks in the image, but there might be self-intersections. Embeddings are, in addition, homeomorphisms onto their images, so in particular they have no self-intersections.)

Studying this functor Imm(-,N) means, among other things, looking at the various spaces Imm(M,N) of immersions of each M into N. We might first ask: can M be immersed in N at all – in other words, is \pi_0(Imm(M,N)) nonempty?

So, for example, the Whitney Embedding Theorem says that if dim(N) is at least 2 dim(M), then there is an embedding of M into N (which is therefore also an immersion).

In more detail, we might want to know what \pi_0(Imm(M,N)) is, which tells how many connected components of immersions there are: in other words, distinct classes of immersions which can’t be deformed into one another by a family of immersions. Or, indeed, we might ask about all the homotopy groups of Imm(M,N), not just the zeroth: what’s the homotopy type of Imm(M,N)? (Once we have a handle on this, we would then want to vary M).

It turns out this question is manageable, partly due to a theorem of Smale and Hirsch, which is an early instance of Gromov’s h-principle – the general principle applies to certain kinds of partial differential relations, saying that any formal solution can be deformed to a holonomic (i.e. genuine) one, so if you want to study the space of solutions up to homotopy, you may as well just study the formal solutions.

The Smale-Hirsch theorem likewise gives a homotopy equivalence of two spaces, one of which is Imm(M,N). The other is the space of “formal immersions”, called Imm^f(M,N). It consists of all (f,F), where f : M \rightarrow N is smooth, and F : TM \rightarrow TN is a map of tangent bundles which covers f, and is injective on each fibre. These are “formally” like immersions, and indeed Imm(M,N) has an inclusion into Imm^f(M,N), sending f to (f,df), which happens to be a homotopy equivalence: it induces isomorphisms of all the homotopy groups. These come from homotopies taking each “formal immersion” to some actual immersion. So we’ve approximated Imm(-,N), up to homotopy, by Imm^f(-,N). (This “homotopy” of functors makes sense because we’re talking about an enriched functor – the source and target categories are enriched in spaces, where the concepts of homotopy theory are all available).

We still haven’t got to manifold calculus, but it will be all about approximating one functor by another – or rather, by a chain of functors which are supposed to be like the Taylor series for a function. The way to get this series has to do with sheafification, so first it’s handy to re-describe what the Smale-Hirsch theorem says in terms of sheaves. This means we want to talk about some category of spaces with a Grothendieck topology.

So let \mathcal{E} be the category whose objects are d-dimensional manifolds and whose morphisms are embeddings (which, of course, are necessarily codimension 0). Now, the point here is that if f : M \rightarrow M' is an embedding in \mathcal{E}, and M' has an immersion into N, this induces an immersion of M into N. This amounts to saying Imm(-,N) is a contravariant functor:

Imm(-,N) : \mathcal{E}^{op} \rightarrow [Spaces]

That makes Imm(-,N) a presheaf. What the Smale-Hirsch theorem tells us is that this presheaf is a homotopy sheaf – but to understand that, we need a few things first.

First, what’s a homotopy sheaf? Well, the condition for a sheaf says that if we have an open cover of M, then

So to say how Imm(-,N) : \mathcal{E}^{op} \rightarrow [Spaces] is a homotopy sheaf, we have to give \mathcal{E} a topology, which means defining a “cover”, which we do in the obvious way – a cover is a collection of morphisms f_i : U_i \rightarrow M such that the union of all the images \cup f_i(U_i) is just M. The topology where this is the definition of a cover can be called J_1, because it has the property that given any open cover and choice of 1 point in M, that point will be in some U_i of the cover.

This is part of a family of topologies, where J_k only allows those covers with the property that given any choice of k points in M, some open set of the cover contains them all. These conditions, clearly, get increasingly restrictive, so we have a sequence of inclusions (a “filtration”):

J_1 \leftarrow J_2 \leftarrow J_3 \leftarrow \dots
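The J_k condition is easy to test on a finite toy model. The sketch below assumes a finite set of points standing in for M, with arbitrary subsets standing in for the open sets of a cover; the function name `is_Jk_cover` is hypothetical. Since a choice of k points may repeat a point, it checks every set of at most k points.

```python
from itertools import combinations

def is_Jk_cover(cover, M, k):
    """True if every set of at most k points of M lies inside a single
    member of the cover."""
    pts = list(M)
    return all(
        any(set(chosen) <= U for U in cover)
        for r in range(1, k + 1)
        for chosen in combinations(pts, r)
    )

M = {0, 1, 2}
chain = [{0, 1}, {1, 2}]            # covers M, but no member contains {0, 2}
pairs = [{0, 1}, {1, 2}, {0, 2}]    # every pair of points lies in some member

chain_levels = [is_Jk_cover(chain, M, k) for k in (1, 2, 3)]
pairs_levels = [is_Jk_cover(pairs, M, k) for k in (1, 2, 3)]
```

The first cover is a J_1 cover but not a J_2 cover, while the second is a J_2 cover but not a J_3 cover. Any cover that passes for k also passes for every smaller k, which is the filtration above.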

Now, with respect to any given one of these topologies J_k, we have the usual situation relating sheaves and presheaves.  Sheaves are defined relative to a given topology (i.e. a notion of cover).  A presheaf on \mathcal{E} is just a contravariant functor from \mathcal{E} (in this case valued in spaces); a sheaf is one which satisfies a descent condition (I’ve discussed this before, for instance here, when I was running the Stacks Seminar at UWO).  The point of a descent condition, for a given topology, is that we can take the values of a functor F “locally” – on the various objects of a cover for M – and “glue” them to find the value for M itself.  In particular, given M \in \mathcal{E} and a cover of it, there’s a diagram consisting of the inclusions of all the double-overlaps of sets in the cover into the original sets.  Then the descent condition for sheaves of spaces is that F(M) is the homotopy limit of the image of this diagram under F.

The general fact is that there’s a reflective inclusion of sheaves into presheaves (see some discussion about reflective inclusions, also in an earlier post).  Any sheaf is a contravariant functor – this is the inclusion of Sh( \mathcal{E} ) into PSh( \mathcal{E} ).  The inclusion has a left adjoint, sheafification, which takes any presheaf in PSh( \mathcal{E} ) to a sheaf which is the “best approximation” to it.  It’s the fact that the inclusion has this adjoint which makes it “reflective”, and provides the sense in which the sheafification is an approximation to the original functor.

The way sheafification works can be worked out from the fact that it’s an adjoint to the inclusion, but it also has a fairly concrete description.  Given any one of the topologies J_k,  we have a whole collection of special diagrams, such as:

U_i \leftarrow U_{ij} \rightarrow U_j

(using the usual notation where U_{ij} = U_i \cap U_j is the intersection of two sets in a cover, and the maps here are the inclusions of that intersection).  This and the various other diagrams involving these inclusions are special, given the topology J_k.  The descent condition for a sheaf F says that if we take the image of this diagram:

F(U_i) \rightarrow F(U_{ij}) \leftarrow F(U_j)

then we can “glue together” the objects F(U_i) and F(U_j) on the overlap to get one on the union.  That is, F is a sheaf if F(U_i \cup U_j) is a limit of the diagram above (intuitively, an element over the union is a pair of elements over U_i and U_j which agree on the overlap).  In a presheaf, it would come equipped with some maps into the F(U_i) and F(U_j): in a sheaf, this object and the maps satisfy some universal property.  Sheafification takes a presheaf F to a sheaf F^{(k)} which does this, essentially by taking all these limits.  More accurately, since these sheaves are valued in spaces, what we really want are homotopy sheaves, where we can replace “limit” with “homotopy limit” in the above – which satisfies a universal property only up to homotopy, and which has a slightly weaker notion of “gluing”.   This (homotopy) sheaf is called F^{(k)} because it depends on the topology J_k which we were using to get the class of special diagrams.
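The gluing condition can be checked by hand in a toy case. The sketch below is an illustration of my own, for an ordinary sheaf of sets rather than a homotopy sheaf: F(U) is the set of all functions from a finite set U to {0, 1}, and restriction is the obvious thing. It counts the pairs of elements over U_i and U_j that agree on the overlap and compares them with F(U_i \cup U_j).

```python
from itertools import product

def sections(U):
    """F(U): all functions U -> {0, 1}, stored as dicts."""
    pts = sorted(U)
    return [dict(zip(pts, values)) for values in product((0, 1), repeat=len(pts))]

def restrict(s, V):
    """Restriction map F(U) -> F(V) for V a subset of U."""
    return {p: s[p] for p in V}

U_i, U_j = {0, 1, 2}, {2, 3}
U_ij = U_i & U_j

# Pairs (s, t) over U_i and U_j whose restrictions to the overlap coincide.
compatible_pairs = [
    (s, t)
    for s in sections(U_i)
    for t in sections(U_j)
    if restrict(s, U_ij) == restrict(t, U_ij)
]

# Each compatible pair glues to a single function on the union.
glued = [{**s, **t} for s, t in compatible_pairs]

print(len(compatible_pairs), len(sections(U_i | U_j)))
```

The two counts agree, and in fact gluing gives a bijection between compatible pairs and elements over the union, which is the sheaf condition for this two-set cover.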

One way to think about F^{(k)} is that we take the restriction to manifolds which are made by pasting together at most k open balls.  Then, knowing only this part of the functor F, we extend it back to all manifolds by a Kan extension (this is the technical sense in which it’s a “best approximation”).

Now the point of all this is that we’re building a tower of functors that are “approximately” like F, agreeing on ever-more-complicated manifolds, which in our motivating example is F = Imm(-,N).  Whichever functor we use, we get a tower of functors connected by natural transformations:

F^{(1)} \leftarrow F^{(2)} \leftarrow F^{(3)} \leftarrow \dots

This happens because we had that chain of inclusions of the topologies J_k.  Now the idea is that if we start with a reasonably nice functor (like F = Imm(-,N) for example), then F is just the limit of this diagram.  That is, it’s the universal thing F which has a map into each F^{(k)} commuting with all these connecting maps in the tower.  The tower of approximations – along with its limit (as a diagram in the category of functors) – is what Goodwillie called the “Taylor tower” for F.  Then we say the functor F is analytic if it’s just (up to homotopy!) the limit of this tower.

By analogy, think of an inclusion of a vector space V with inner product into another such space W which has higher dimension.  Then there’s an orthogonal projection onto the smaller space, which is an adjoint (as a map of inner product spaces) to the inclusion – so these are like our reflective inclusions.  So the smaller space can “reflect” the bigger one, while not being able to capture anything in the orthogonal complement.  Now suppose we have a tower of inclusions V \leftarrow V' \leftarrow V'' \dots, where each space is of higher dimension, such that each of the V is included into W in a way that agrees with their maps to each other.  Then given a vector w \in W, we can take a sequence of approximations (v,v',v'',\dots) in the V spaces.  If w was “nice” to begin with, this series of approximations will eventually at least converge to it – but it may be that our tower of V spaces doesn’t let us approximate every w in this way.

That’s precisely what one does in calculus with Taylor series: we have a big vector space W of smooth functions, and a tower of spaces we use to approximate.  These are polynomial functions of different degrees: first linear, then quadratic, and so forth.  The approximations to a function f are orthogonal projections onto these smaller spaces.  The sequence of approximations, or rather its limit (as a sequence in the inner product space W), is just what we mean by a “Taylor series for f”.  If f is analytic in the first place, then this sequence will converge to it.
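As a concrete reminder of the ordinary case, here is a small sketch (my own choice of example, using the exponential function expanded at 0) showing the tower of polynomial approximations closing in on an analytic function.

```python
import math

def taylor_exp(x, degree):
    """Degree-`degree` Taylor polynomial of exp at 0, evaluated at x."""
    return sum(x**n / math.factorial(n) for n in range(degree + 1))

# Errors of the degree-1, degree-2, ..., degree-7 approximations at x = 1.
errors = [abs(math.exp(1.0) - taylor_exp(1.0, k)) for k in range(1, 8)]

for k, err in enumerate(errors, start=1):
    print(k, err)
```

Each stage of the tower does at least as well as the one before it at this point, and the errors shrink towards zero, which is what being analytic buys us.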

The same sort of phenomenon is happening with the Goodwillie calculus for functors: our tower of sheafifications of some functor F are just “projections” onto smaller categories (of sheaves) inside the category of all contravariant functors.  (Actually, “reflections”, via the reflective inclusions of the sheaf categories for each of the topologies J_k).  The Taylor Tower for this functor is just like the Taylor series approximating a function.  Indeed, this analogy is fairly close, since the topologies J_k will give approximations of F which are in some sense based on k points (so-called k-excisive functors, which in our terminology here are sheaves in these topologies).  Likewise, a degree-k polynomial approximation approximates a smooth function, in general in a way that can be made to agree at k points.

Finally, recall that I mentioned that the Goodwillie calculus is actually more general than this, and applies not only to spaces but to spectra. The point is that the functor Imm(-,N) defines a kind of generalized cohomology theory – the cohomology groups for M are the \pi_i(Imm(M,N)). So functors satisfying the axioms of a generalized cohomology theory are represented by spectra, whereas N here is a special case that happens to be a space.

Lots of geometric problems can be thought of as classified by this sort of functor – if N = BG, the classifying space of a group, and we drop the requirement that the map be an immersion, then we’re looking at the functor that gives the moduli space of G-connections on each M.  The point is that the Goodwillie calculus gives a sense in which we can understand such functors by simpler approximations to them.

So Dan Christensen, who used to be my supervisor while I was a postdoc at the University of Western Ontario, came to Lisbon last week and gave a talk about a topic I remember hearing about while I was there.  This is the category Diff of diffeological spaces as a setting for homotopy theory.  Just to make things scan more nicely, I’m going to say “smooth space” for “diffeological space” here, although this term is in fact ambiguous (see Andrew Stacey’s “Comparative Smootheology” for lots of details about options).  There’s a lot of information about Diff in Patrick Iglesias-Zemmour’s draft-of-a-book.

Motivation

The point of the category Diff, initially, is that it extends the category of manifolds while having some nicer properties.  Thus, while all manifolds are smooth spaces, there are others, which allow Diff to be closed under various operations.  These would include taking limits and colimits: for instance, any subset of a smooth space becomes a smooth space, and any quotient of a smooth space by an equivalence relation is a smooth space.  Then too, Diff has exponentials (that is, if A and B are smooth spaces, so is A^B = Hom(B,A)).

So, for instance, this is a good context for constructing loop spaces: a manifold M is a smooth space, and so is its loop space LM = M^{S^1} = Hom(S^1,M), the space of all maps of the circle into M.  This becomes important for talking about things like higher cohomology, gerbes, etc.  When starting with the category of manifolds, doing this requires you to go off and define infinite dimensional manifolds before LM can even be defined.  Likewise, the irrational torus is hard to talk about as a manifold: you take a torus, thought of as \mathbb{R}^2 / \mathbb{Z}^2.  Then take a direction in \mathbb{R}^2 with irrational slope, and identify any two points which are translates of each other in \mathbb{R}^2 along the direction of this line.  The orbit of any point is then dense in the torus, so this is a very nasty space, certainly not a manifold.  But it’s a perfectly good smooth space.

Well, these examples motivate the kinds of things these nice categorical properties allow us to do, but Diff wouldn’t deserve to be called a category of “smooth spaces” (Souriau’s original name for them) if they didn’t allow a notion of smooth maps, which is the basis for most of what we do with manifolds: smooth paths, derivatives of curves, vector fields, differential forms, smooth cohomology, smooth bundles, and the rest of the apparatus of differential geometry.  As with manifolds, this notion of smooth map ought to get along with the usual notion for \mathbb{R}^n in some sense.

Smooth Spaces

Thus, a smooth (i.e. diffeological) space consists of:

  • A set X (of “points”)
  • A set \{ f : U \rightarrow X \} (of “plots”) for every n and open U \subset \mathbb{R}^n such that:
  1. All constant maps are plots
  2. If f: U \rightarrow X is a plot, and g : V \rightarrow U is a smooth map, f \circ g : V \rightarrow X is a plot
  3. If \{ g_i : U_i \rightarrow U\} is an open cover of U, and f : U \rightarrow X is a map, whose restrictions f \circ g_i : U_i \rightarrow X are all plots, so is f

A smooth map between smooth spaces is one that gets along with all this structure (i.e. the composite with every plot is also a plot).

These conditions mean that smooth maps agree with the usual notion in \mathbb{R}^n, and we can glue together smooth spaces to produce new ones.  A manifold becomes a smooth space by taking all the usual smooth maps to be plots: the manifolds form a full subcategory of Diff (we introduce new objects which aren’t manifolds, but no new morphisms between manifolds).  A choice of a set of plots for some space X is a “diffeology”: there can, of course, be many different diffeologies on a given space.

So, in particular, diffeologies can encode a little more than the charts of a manifold.  Just for one example, a diffeology can have “stop signs”, as Dan put it – points with the property that any smooth map from I= [0,1] which passes through them must stop at that point (have derivative zero – or higher derivatives, if you like).  Along the same lines, there’s a nonstandard diffeology on I itself with the property that any smooth map from this I into a manifold M must have all derivatives zero at the endpoints.  This is a better object for defining smooth fundamental groups: you can concatenate these paths at will and they’re guaranteed to be smooth.

As a Quasitopos

An important fact about these smooth spaces is that they are concrete sheaves (i.e. sheaves with underlying sets) on the concrete site (i.e. a Grothendieck site where objects have underlying sets) whose objects are the U \subset \mathbb{R}^n.  This implies many nice things about the category Diff.  One is that it’s a quasitopos.  This is almost the same as a topos (in particular, it has limits, colimits, etc. as described above), but where a topos has a “subobject classifier”, a quasitopos has a weak subobject classifier (which, perhaps confusingly, is “weak” because it only classifies the strong subobjects).

So remember that a subobject classifier is an object with a map t : 1 \rightarrow \Omega from the terminal object, so that any monomorphism (subobject) A \rightarrow X is the pullback of t along some map X \rightarrow \Omega (the classifying map).  In the topos of sets, this is just the inclusion of a one-element set \{\star\} into a two-element set \{T,F\}: the classifying map for a subset A \subset X sends everything in A (i.e. in the image of the inclusion map) to T = Im(t), and everything else to F.  (That is, it’s the characteristic function.)  So pulling back T along the classifying map recovers exactly the subset A.
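In the topos of sets this is simple enough to spell out directly. The following is a minimal sketch with a made-up four-element set; the helper names `classifying_map` and `pull_back_true` are just for illustration.

```python
X = {"a", "b", "c", "d"}
A = {"a", "c"}

def classifying_map(A):
    """chi_A : X -> {True, False}; True plays the role of T = Im(t)."""
    return lambda x: x in A

def pull_back_true(X, chi):
    """Pullback of t : {*} -> {T, F} along chi, i.e. the preimage of T."""
    return {x for x in X if chi(x)}

chi = classifying_map(A)
recovered = pull_back_true(X, chi)
print(recovered == A)
```

So subsets of X and maps from X to the two-element set determine each other, which is what it means for \{T,F\} to classify subobjects here.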

Any topos has one of these – in particular the topos of sheaves on the diffeological site has one.  But Diff consists of the concrete sheaves, not all sheaves.  The subobject classifier of the topos won’t be concrete – but it does have a “concretification”, which turns out to be the weak subobject classifier.  The subobjects of a smooth space X which it classifies (i.e. for which there’s a classifying map as above) are exactly the subsets A \subset X equipped with the subspace diffeology.  (Which is defined in the obvious way: the plots are the plots of X which land in A).

We’ll come back to this quasitopos shortly.  The main point is that Dan and his graduate student, Enxin Wu, have been trying to define a different kind of structure on Diff.  We know it’s good for doing differential geometry.  The hope is that it’s also good for doing homotopy theory.

As a Model Category

The basic idea here is pretty well supported: naively, one can do a lot of the things done in homotopy theory in Diff: to start with, one can define the “smooth homotopy groups” \pi_n^s(X;x_0) of a pointed space.  It’s a theorem by Dan and Enxin that several possible ways of doing this are equivalent.  But, for example, Iglesias-Zemmour defines them inductively, so that \pi_0^s(X) is the set of path-components of X, and \pi_k^s(X) = \pi_{k-1}^s(LX) is defined recursively using loop spaces, mentioned above.  The point is that this all works in Diff much as for topological spaces.

In particular, there are analogs for the \pi_k^s for standard theorems like the long exact sequence of homotopy groups for a bundle.  Of course, you have to define “bundle” in Diff – it’s a smooth surjective map X \rightarrow Y, but saying a diffeological bundle is “locally trivial” doesn’t mean “over open neighborhoods”, but “under pullback along any plot”.  (Either of these converts a bundle over a whole space into a bundle over part of \mathbb{R}^n, where things are easy to define).

Less naively, the kind of category where homotopy theory works is a model category (see also here).  So the project Dan and Enxin have been working on is to give Diff this sort of structure.  While there are technicalities behind those links, the essential point is that this means you have a complete and cocomplete category (i.e. one with all limits and colimits, which Diff has), on which you’ve defined three classes of morphisms: fibrations, cofibrations, and weak equivalences.  These are supposed to abstract the properties of maps in the homotopy theory of topological spaces – in that case weak equivalences being maps that induce isomorphisms of homotopy groups, the other two being defined by having some lifting properties (i.e. you can lift a homotopy, such as a path, along a fibration).

So to abstract the situation in Top, these classes have to satisfy some axioms (including an abstract form of the lifting properties).  There are slightly different formulations, but for instance, the “2 of 3” axiom says that if two of f, g and f \circ g are weak equivalences, so is the third.  Or, again, there should be a factorization for any morphism into a fibration and an acyclic cofibration (i.e. one which is also a weak equivalence), and also vice versa (that is, moving the adjective “acyclic” to the fibration).  Defining some classes of maps isn’t hard, but it tends to be that proving they satisfy all the axioms IS hard.

Supposing you could do it, though, you have things like the homotopy category (where you formally allow all weak equivalences to have inverses), derived functors (which come from a situation where homotopy theory is “modelled” by categories of chain complexes), and various other fairly powerful tools.  Doing this in Diff would make it possible to use these things in a setting that supports differential geometry.  In particular, you’d have a lot of high-powered machinery that you could apply to prove things about manifolds, even though it doesn’t work in the category Man itself – only in the larger setting Diff.

Dan and Enxin are still working on nailing down some of the proofs, but it appears to be working.  Their strategy is based on the principle that, for purposes of homotopy, topological spaces act like simplicial complexes.  So they define an affine “simplex”, \mathbb{A}^n = \{ (x_0, x_1, \dots, x_n) \in \mathbb{R}^{n+1} | \sum x_i = 1 \}.  These aren’t literally simplexes: they’re affine planes, which we understand as smooth spaces – with the subspace diffeology from \mathbb{R}^{n+1}.  But they behave like simplexes: there are face and degeneracy maps for them, and the like.  They form a “cosimplicial object” (which we can think of as a functor \Delta \rightarrow Diff, where \Delta is the simplex category).
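To see that the affine planes really do carry this structure, here is a small sketch of the coface and codegeneracy maps. The formulas are the usual convention for the standard simplex (insert a zero coordinate, or add two adjacent coordinates), which I am assuming carries over; the talk did not spell them out. Exact fractions are used so the identities can be checked without rounding.

```python
from fractions import Fraction

def coface(i, x):
    """d^i : A^{n-1} -> A^n, inserting a zero coordinate at position i."""
    return x[:i] + (Fraction(0),) + x[i:]

def codegeneracy(i, x):
    """s^i : A^{n+1} -> A^n, adding coordinates i and i+1 together."""
    return x[:i] + (x[i] + x[i + 1],) + x[i + 2:]

def in_affine_simplex(x):
    """Membership test for A^n: the coordinates sum to 1."""
    return sum(x) == 1

# A point of A^2. The negative coordinate is allowed: this is an affine
# plane, not the usual closed simplex.
x = (Fraction(3, 2), Fraction(-1), Fraction(1, 2))

print(in_affine_simplex(coface(1, x)), in_affine_simplex(codegeneracy(0, x)))
# One of the cosimplicial identities: d^j d^i = d^i d^{j-1} for i < j.
print(coface(2, coface(0, x)) == coface(0, coface(1, x)))
```

Both kinds of map are linear in the coordinates of \mathbb{R}^{n+1}, hence smooth, and they preserve the condition \sum x_i = 1, so they restrict to maps between the affine simplexes.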

Then the point is one can look at, for a smooth space X, the smooth singular simplicial set S(X): it’s a simplicial set where the sets are sets of smooth maps from the affine simplex into X.  Likewise, for a simplicial set S, there’s a smooth space, the “geometric realization” |S|.  These give two functors |\cdot | and S, which are adjoints (| \cdot | is the left adjoint).  And then, weak equivalences and fibrations being defined in simplicial sets (w.e. are homotopy equivalences of the realization in Top, and fibrations are “Kan fibrations”), you can just pull the definition back to Diff: a smooth map is a w.e. if its image under S is one.  The cofibrations get indirectly defined via the lifting properties they need to have relative to the other two classes.

So it’s still not completely settled that this definition actually gives a model category structure, but it’s pretty close.  Certainly, some things are known.  For instance, Enxin Wu showed that if you have a fibrant object X (i.e. one where the unique map to the terminal object is a fibration – these are generally the “good” objects to define homotopy groups on), then the smooth homotopy groups agree with the simplicial ones for S(X).  This implies that for these objects, the weak equivalences are exactly the smooth maps that give isomorphisms for homotopy groups.  And so forth.  But notice that even some fairly nice objects aren’t fibrant: two lines glued together at a point isn’t, for instance.

There are various further results.  One, a consequence of a result Enxin proved, is that all manifolds are fibrant objects, where these nice properties apply.  It’s interesting that this comes from the fact that, in Diff, every (connected) manifold is a homogeneous space.  These are quotients of smooth groups, G/H – the space is a space of cosets, and H is understood to be the stabilizer of the point.  Usually one thinks of homogeneous spaces as fairly rigid things: the Euclidean plane, say, where G is the whole Euclidean group, and H the rotations; or a sphere, where G is all n-dimensional rotations, and H the ones that fix some point on the sphere.  But that’s for Lie groups.  The point is that G = Diff(M,M), the space of diffeomorphisms from M to itself, is a perfectly good smooth group.  Then the subgroup H of diffeomorphisms that fix a chosen point is a fine smooth subgroup, and G/H is a homogeneous space in Diff.  But that’s just M, with G acting transitively on it – any point can be taken anywhere on M.

Cohesive Infinity-Toposes

One further thing I’d mention here is a related but more abstract approach to the question of how to incorporate homotopy-theoretic tools with a setting that supports differential geometry.  This is the notion of a cohesive topos, and more generally of a cohesive infinity-topos.  Urs Schreiber has advocated for this approach, for instance.  It doesn’t really conflict with the kind of thing Dan was talking about, but it gives a setting for it with a lot of abstract machinery.  I won’t try to explain the details (which anyway I’m not familiar with), but just enough to suggest how the two seem to me to fit together, after discussing it a bit with Dan.

The idea of a cohesive topos seems to start with Bill Lawvere, and it’s supposed to characterize something about those categories which are really “categories of spaces” the way Top is.  Intuitively, spaces consist of “points”, which are held together in lumps we could call “pieces”.  Hence “cohesion”: the points of a typical space cohere together, rather than being a dust of separate elements.  When they are such a dust, as in a discrete space, we just say that each piece happens to have just one point in it – but a priori we distinguish the two ideas.  So we might normally say that Top has an “underlying set” functor U : Top \rightarrow Set, and its left adjoint, the “discrete space” functor Disc: Set \rightarrow Top (left adjoint since set maps from S are the same as continuous maps from Disc(S) – it’s easy for maps out of Disc(S) to be continuous, since every subset is open).

In fact, any topos of sheaves on some site has a pair of functors like this (where U becomes \Gamma, the “set of global sections” functor), essentially because Set is the topos of sheaves on a single point, and there’s a terminal map from any site into the point.  So this adjoint pair is the “terminal geometric morphism” into Set.

But this omits a couple of other things that apply to Top: U has a right adjoint, Codisc: Set \rightarrow Top, where Codisc(S) has only S and \emptyset as its open sets.  In Codisc(S), all the points are “stuck together” in one piece.  On the other hand, Disc itself has a left adjoint, \Pi_0: Top \rightarrow Set, which gives the set of connected components of a space.  \Pi_0(X) is another kind of “underlying set” of a space.  So we call a topos \mathcal{E} “cohesive” when the terminal geometric morphism extends to a chain of four adjoint functors in just this way, which satisfy a few properties that characterize what’s happening here.  (We can talk about “cohesive sites”, where this happens.)

Now Diff isn’t exactly a category of sheaves on a site: it’s the category of concrete sheaves on a (concrete) site.  There is a cohesive topos of all sheaves on the diffeological site.  (What’s more, it’s known to have a model category structure).  But now, it’s a fact that any cohesive topos \mathcal{E} has a subcategory of concrete objects (ones where the canonical unit map X \rightarrow Codisc(\Gamma(X)) is mono: roughly, we can characterize the morphisms of X by what they do to its points).  This category is always a quasitopos (and it’s a reflective subcategory of \mathcal{E}: see the previous post for some comments about reflective subcategories if interested…)  This is where Diff fits in here.  Diffeologies define a “cohesion” just as topologies do: points are in the same “piece” if there’s some plot from a connected part of \mathbb{R}^n that lands on both.  Why is Diff only a quasitopos?  Because in general, the subobject classifier in \mathcal{E} isn’t concrete – but it will have a “concretification”, which is the weak subobject classifier I mentioned above.

Where the “infinity” part of “infinity-topos” comes in is the connection to homotopy theory.  Here, we replace the topos Set with the infinity-topos of infinity-groupoids.  Then the “underlying” functor captures not just the set of points of a space X, but its whole fundamental infinity-groupoid.  Its objects are points of X, its morphisms are paths, 2-morphisms are homotopies of paths, and so on.  All the homotopy groups of X live here.  So a cohesive infinity-topos is defined much like above, but with \infty-Gpd playing the role of Set, and with that \Pi_0 functor replaced by \Pi, something which, implicitly, gives all the homotopy groups of X.  We might look for cohesive infinity-toposes to be given by the (infinity)-categories of simplicial sheaves on cohesive sites.

This raises a point Dan made in his talk: over the diffeological site D, we can talk about a cube of different structures that live over it, starting with presheaves: PSh(D).  We can add different modifiers to this: the sheaf condition; the adjective “concrete”; the adjective “simplicial”.  Various combinations of these adjectives (e.g. simplicial presheaves) are known to have a model structure.  Diff is the case where we have concrete sheaves on D.  So far, it hasn’t been proved, but it looks like it shortly will be, that this has a model structure.  This is a particularly nice one, because these things really do seem a lot like spaces: they’re just sets with some easy-to-define and well-behaved (that’s what the sheaf condition does) structure on them, and they include all the examples a differential geometer requires, the manifolds.

Last week I spoke in Montreal at a session of the Philosophy of Science Association meeting.  Here are some notes for it.  Later on I’ll do a post about the other talks at the meeting.

Right now, though, the meeting slowed me down from describing a recent talk in the seminar here at IST.  This was Gonçalo Rodrigues’ talk on categorifying measure theory.  It was based on this paper here, which is pretty long and goes into some (but not all) of the details.  Apparently an updated version that fills in some of what’s not there is in the works.

In any case, Gonçalo takes as the starting point for categorifying ideas in analysis the paper “Measurable Categories” by David Yetter, which is the same point where I started on this topic, although he then concludes that there are problems with that approach.  Part of the reason for saying this has to do with the fact that the category of Hilbert spaces has many bad properties – or rather, fails to have many of the good ones that it should to play the role one might expect in categorifying ideas from analysis.

Yetter’s idea can be described, very roughly, as follows: we would like to categorify the concept of a function-space on a measure space (X,\mu).  That is, spaces like L^2(X,\mu) or L^{\infty}(X,\mu).  The reason for this is that the 2-vector-spaces of Kapranov and Voevodsky are very elegant, but intrinsically finite-dimensional, categorifications of “vector space”.  An infinite-dimensional version would be important for representation theory, particularly of noncompact Lie groups or 2-groups, but even just infinite ones, since there are relatively few endomorphisms of KV 2-vector spaces.  Yetter’s paper constructs analogs to the space of measurable functions \mathcal{M}(X), where “functions” take values in Hilbert spaces.

A measurable field of Hilbert spaces is, roughly, a family of Hilbert spaces indexed by points of X, together with a nice space of “measurable sections”.  This is supposed to be an infinite-dimensional, measure-theoretic counterpart to an object in a KV 2-vector space, which always looks like \mathbf{Vect}^k for some natural number k, which is now being replaced by (X,\mu).  One of the key tools in Yetter’s paper is the direct integral of a field of Hilbert spaces, which is similarly the counterpart to the direct sum \bigoplus in the discrete world.  It just gives the space of measurable sections (taken up to almost-everywhere equivalence, as usual).  This was the main focus of Gonçalo’s talk.

The direct integral has one major problem, compared to the (finite) direct sum it is supposed to generalize – namely, the direct sum is a categorical coproduct, in \mathbf{Vect} or any other KV 2-vector space.  Actually, it is both a product and a coproduct (\mathbf{Vect} is abelian), so it is defined by a nice universal property.  The direct integral, on the other hand, is not.  It doesn’t have any similarly nice universal property.  (In the infinite-dimensional case, colimits and limits would be expected to become different in any case, but the direct integral is neither).  This means that many proofs in analysis will be hard to reproduce in the categorified setting – universal properties mean one doesn’t have to do nearly as much work to do this, among their other good qualities.  This is related to the issue that the category \mathbf{Hilb} does not have all limits and colimits.

Gonçalo’s paper and talk outline a program where one can categorify a lot of the proofs in analysis, by using a slightly different framework which uses a bigger category than \mathbf{Hilb}, namely Ban_C, whose objects are Banach spaces and whose maps are (linear) contractions.  A Banach Category is a category enriched in Ban_C.  Now, Banach spaces have a norm, but not necessarily an inner product, and this small weakening makes them much worse than Hilbert spaces as objects.  Many intuitions from Hilbert spaces, like the one that says any subspace has a complement, just fail: the corresponding notion for Banach spaces is the quasicomplement (X and Y are quasicomplements if they intersect only at zero, and their sum is dense in the whole space), and it’s quite possible to have subspaces which don’t have one.  Other unpleasant properties abound.

Yet Ban_C is a much nicer category than Hilb.  (So we follow the general dictum that it’s better to have a nice category with bad objects than a bad category with nice objects – the same motivation behind “smooth spaces” instead of manifolds, and the like.)  It’s complete and cocomplete (i.e. has all limits and colimits), as well as monoidal closed – for Banach spaces A and B, the space Hom(A,B) is also in Ban_C.  None of these facts holds for Hilb.  On the other hand, the space of bounded maps between Hilbert spaces is a Banach space (with the operator norm), but not necessarily a Hilbert space.  So even Hilb is already a Banach category.

It also turns out that, unlike in Hilb, limits and colimits (where those exist in Hilb) are not necessarily isomorphic.  In particular, in Ban_C, the coproduct and product A + B and A \times B both have the same underlying vector space A \oplus B, but the norms are different.  For Hilbert spaces, the inner product comes from the Pythagorean formula in either case, but for Banach spaces, the coproduct gets the sum of the two norms, and the product gets the supremum.  It turns out that coproducts are the more important concept, and this is where the direct integral comes in.
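The difference between the three norms on A \oplus B is easy to see numerically. In this sketch I use finite-dimensional spaces with the Euclidean norm as stand-ins for A and B (an assumption purely for illustration); the function names are mine.

```python
import math

def norm(v):
    """Euclidean norm on R^n, standing in for the norms on A and B."""
    return math.sqrt(sum(t * t for t in v))

def coproduct_norm(a, b):
    """Norm on A + B in Ban_C: the sum of the two norms."""
    return norm(a) + norm(b)

def product_norm(a, b):
    """Norm on A x B in Ban_C: the supremum of the two norms."""
    return max(norm(a), norm(b))

def hilbert_norm(a, b):
    """Norm on the Hilbert space direct sum: the Pythagorean formula."""
    return math.hypot(norm(a), norm(b))

a, b = (3.0, 4.0), (0.0, 12.0)   # norms 5 and 12
print(product_norm(a, b), hilbert_norm(a, b), coproduct_norm(a, b))
```

The same pair (a, b) gets three different lengths, with the Pythagorean one sitting between the supremum and the sum.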

First, we can talk about Banach 2-spaces (the analogs of 2-vector spaces): these are just Banach categories which are cocomplete (have all weighted colimits).  Maps between them are cocontinuous functors – that is, colimit-preserving ones.  (Though properly, functors between Banach categories ought to be contractions on Hom-spaces).  Then there are categorified analogs of all sorts of Banach space structure in a familiar way – the direct sum (coproduct) is the analog of vector addition, the category Ban_C is the analog of the base field (say, \mathbb{R}), and so on.

This all gives the setting for categorified measure theory.  Part of the point of choosing Ban_C is that you can now reason out at least some of how it works by analogy.  To start with, one needs to fix a Boolean algebra \Omega – this is to be the \sigma-algebra of measurable sets for some measure space, though it’s important that it needn’t have any actual points (this is a notion of measure space akin to the notion of locale in the topological world).  This part of the theory isn’t categorified (arguably a limitation of this approach, but not one that’s any different from Yetter’s).  Instead, we categorify the definition of measure itself.

A measure is a function \mu : \Omega \rightarrow \mathbb{R} – it assigns a number to each measurable set.  The pair (\Omega,\mu) is a measure algebra, and relates to a measure space the way a locale relates to a topological space.  So a categorified measure \nu should be a functor from \Omega (seen now as a category) into Ban_C.  (We can generalize this: the measure could be valued in some vector space over \mathbb{R}, and a categorified measure could be a functor into some other Banach 2-space.)  Since we’re thinking of \Omega as a lattice of subsets, it makes some sense to call \nu a presheaf, or rather co-presheaf.  What’s more, just as a measure is additive (\mu(A + B) = \mu(A) + \mu(B), for disjoint sets, where + is the union), so also the categorical measure \nu should be (finitely) additive up to isomorphism.  So we’re assigning Banach spaces to all the measurable sets.  This is a “co”-presheaf – which is to say, a covariant functor, so the spaces “nest”: whenever A \subset B for measurable sets, there is a corresponding map \nu(A) \rightarrow \nu(B) (loosely, \nu(A) \leq \nu(B)).

An intuition for how this works comes from a special case (not at all exhaustive), where we start with an actual, uncategorified, measure space (X,\mu).  Then one categorified measure will arise by taking \nu(E) = L_1(E,\mu): the Banach space associated to a measurable set E is the space of integrable functions.  We can take any “scalar” multiple of this, too: given a fixed Banach space B, let \nu(E) = L_1(E,\mu) \otimes B.  But there are lots of examples that aren’t like this.
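
As a quick check that this example really is additive in the required sense (a sketch, using only the standard fact that the integral of |f| splits over disjoint sets): for disjoint measurable sets A and B and f \in L_1(A \cup B,\mu),

```latex
\[
\|f\|_{L_1(A \cup B,\mu)}
  = \int_{A \cup B} |f| \, d\mu
  = \int_A |f| \, d\mu + \int_B |f| \, d\mu
  = \bigl\| f|_A \bigr\|_{L_1(A,\mu)} + \bigl\| f|_B \bigr\|_{L_1(B,\mu)},
\]
\[
\text{so that} \quad L_1(A \cup B,\mu) \;\cong\; L_1(A,\mu) + L_1(B,\mu).
\]
```

The norm on the right is the sum of the two norms, which is the coproduct norm rather than the product norm.  This is one way to see why coproducts, not products, are the relevant notion of “sum” for a categorified measure.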

All this is fine, but the point here is to define integration.  The usual way to go about this when you learn analysis is to start with characteristic functions of measurable sets, then define a sequence through simple functions, measurable functions, and so forth.  Eventually one can define L^p spaces based on the convergence of various integrals.  Something similar happens here.

The analog of a function here is a sheaf: a (compatible) assignment of Banach spaces to measurable sets.  (Technically, to get to sheaves, we need an idea of “cover” by measurable sets, but it’s pretty much the obvious one, modulo the subtlety that we should only allow countable covers.) The idea will be to start with characteristic sheaves for measurable sets, then take some kind of completion of the category of all of these as a definition of “measurable sheaf”.  Then the point will be that we can extend the measure from characteristic sheaves to all measurable sheaves using a limit (actually, a colimit), analogous to the way we define a Lebesgue integral as a limit of simple functions approximating a measurable one.

A characteristic sheaf \chi(E) for a measurable set E \in \Omega might be easiest to visualize in terms of a characteristic bundle, which just puts a copy of the base field (we’ve been assuming it’s \mathbb{R}) at each point of E, and the zero vector space everywhere else.  (This is a bundle in the measurable sense, not the topological one – assuming X has a topology other than \Omega itself.)  Very intuitively, to turn this into a sheaf, one can just use brute force and assign to a set A the product of all the spaces lying over the points of A.  A little less crudely, one should take a space of sections with decent properties – so that \chi(E) assigns to A a space of functions on E \cap A.  In particular, the functor \chi : \Omega \rightarrow L_{\infty}(\Omega) which picks out all the (measurable) bounded sections is a universal way to do this.

Now the point is that the algebra of measurable sets, \Omega, thought of as a category, embeds into the category of presheaves on it by \chi : \Omega \rightarrow \mathbf{PShv}(\Omega), taking a set to its characteristic sheaf.  Given a measure valued in some Banach category, \nu : \Omega \rightarrow \mathcal{B}, we can find the left Kan extension \int_X d\nu : \mathbf{PShv}(\Omega) \rightarrow \mathcal{B}, such that \nu = \int_X d\nu \circ \chi.  The Kan extension is a universal way to extend \nu to all of \mathbf{PShv}(\Omega) so that this is true, and it can be calculated as a colimit.
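
For orientation, the colimit in question can be written schematically as a coend.  This is the usual formula for a left Kan extension along a Yoneda-type embedding, and it rests on assumptions of mine that aren’t spelled out above: that \chi behaves like the Yoneda embedding (so that maps \chi(E) \rightarrow F correspond to F(E)), and that \mathcal{B} is cocomplete with a tensoring \otimes by Banach spaces.  The second line is the classical integral of a simple function, for comparison:

```latex
\[
\Bigl( \int_X d\nu \Bigr)(F) \;\cong\; \int^{E \in \Omega} F(E) \otimes \nu(E),
\qquad F \in \mathbf{PShv}(\Omega),
\]
\[
\int_X \Bigl( \sum_i a_i \, \chi_{E_i} \Bigr) \, d\mu \;=\; \sum_i a_i \, \mu(E_i).
\]
```

In this reading the coend plays the role of the sum, the spaces F(E) play the role of the coefficients a_i, and \nu(E) plays the role of \mu(E_i).  Under the same assumptions, taking F = \chi(E) gives back \nu(E), which is the condition \nu = \int_X d\nu \circ \chi.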

The essential fact here is that the characteristic sheaves are dense in \mathbf{PShv}(\Omega): any presheaf can be found as a colimit of the characteristic ones.  This is analogous to how any function can be approximated by linear combinations of characteristic functions.  This means that the integral defined above will actually give interesting results for all the sheaves one might expect.

I’m glossing over some points here, of course – for example, the distinction between sheaves and presheaves, the role of sheafification, etc.  If you want to get a more accurate picture, check out the paper I linked to up above.

All of this granted, however, many of the classical theorems of measure theory have analogs that are proved in essentially the same way as the standard versions.  One can see the presheaf category as a categorified analog of L_1(X,\nu), and get the Fubini theorem, for instance: there is a canonical equivalence (no longer isomorphism) between (a suitable) tensor product of \mathbf{PShv}(X) and \mathbf{PShv}(Y) on one hand, and on the other \mathbf{PShv}(X \times Y).  Doing integration, one can then do all the usual things – exchange order of integration between X and Y, say – in analogous conditions.  The use of universal properties to define integrals etc. means that one doesn’t need to fuss about too much with coherence laws, and so the proofs of the categorified facts are much the same as the original proofs.
